Academic literature on the topic 'Self-supervised learning'

Create an accurate reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Self-supervised learning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "Self-supervised learning"

1

Zhao, Qingyu, Zixuan Liu, Ehsan Adeli, and Kilian M. Pohl. "Longitudinal self-supervised learning." Medical Image Analysis 71 (July 2021): 102051. http://dx.doi.org/10.1016/j.media.2021.102051.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Wang, Fei, and Changshui Zhang. "Robust self-tuning semi-supervised learning." Neurocomputing 70, no. 16-18 (October 2007): 2931–39. http://dx.doi.org/10.1016/j.neucom.2006.11.004.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Hrycej, Tomas. "Supporting supervised learning by self-organization." Neurocomputing 4, no. 1-2 (February 1992): 17–30. http://dx.doi.org/10.1016/0925-2312(92)90040-v.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Shin, Sungho, Jongwon Kim, Yeonguk Yu, Seongju Lee, and Kyoobin Lee. "Self-Supervised Transfer Learning from Natural Images for Sound Classification." Applied Sciences 11, no. 7 (March 29, 2021): 3043. http://dx.doi.org/10.3390/app11073043.

Full text
Abstract:
We propose the implementation of transfer learning from natural images to audio-based images using self-supervised learning schemes. Through self-supervised learning, convolutional neural networks (CNNs) can learn general representations of natural images without labels. In this study, a convolutional neural network was pre-trained with natural images (ImageNet) via self-supervised learning; subsequently, it was fine-tuned on the target audio samples. Pre-training with the self-supervised learning scheme significantly improved sound classification performance when validated on the following benchmarks: ESC-50, UrbanSound8k, and GTZAN. The network pre-trained via self-supervised learning achieved a similar level of accuracy to networks pre-trained using a supervised method that requires labels. Therefore, we demonstrated that transfer learning from natural images contributes to improvements in audio-related tasks, and that self-supervised learning with natural images is an adequate pre-training scheme in terms of simplicity and effectiveness.
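To make the transfer-learning pipeline above concrete, here is a minimal sketch of its final stage: fitting a classifier head on frozen pre-trained features. The NumPy logistic-regression probe and the toy two-cluster "features" are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def linear_probe(features, labels, num_classes, lr=0.5, epochs=200):
    """Fit a softmax classifier on frozen (pre-trained) features.

    Toy stand-in for the fine-tuning stage: the backbone is frozen and
    only a linear head is trained on the target task.
    """
    rng = np.random.default_rng(0)
    n, d = features.shape
    W = rng.normal(scale=0.01, size=(d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = features @ W + b
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                   # softmax cross-entropy gradient
        W -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

# Toy "pre-trained" features: two well-separated clusters, two classes.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 8)), rng.normal(1.0, 0.3, (50, 8))])
y = np.array([0] * 50 + [1] * 50)
W, b = linear_probe(X, y, num_classes=2)
acc = ((X @ W + b).argmax(axis=1) == y).mean()
```

In the paper the frozen features come from an ImageNet-pretrained CNN applied to audio spectrogram images; here they are random clusters purely to keep the sketch self-contained.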
APA, Harvard, Vancouver, ISO, and other styles
5

Liu, Yuanyuan, and Qianqian Liu. "Research on Self-Supervised Comparative Learning for Computer Vision." Journal of Electronic Research and Application 5, no. 3 (August 17, 2021): 5–17. http://dx.doi.org/10.26689/jera.v5i3.2320.

Full text
Abstract:
In recent years, self-supervised learning, which does not require large numbers of manual labels, has been used to generate supervision signals from the data itself and thereby learn representations of samples. Self-supervised learning solves the problem of learning semantic features from unlabeled data and enables pre-training of models on large data sets. Its significant advantages have been extensively studied by scholars in recent years. There are usually three types of self-supervised learning: “Generative, Contrastive, and Generative-Contrastive.” The contrastive (comparative) learning model is relatively simple, and its performance on current downstream tasks is comparable to that of supervised learning methods. Therefore, we propose a conceptual analysis framework covering the data augmentation pipeline, architectures, pretext tasks, comparison methods, and semi-supervised fine-tuning. Based on this conceptual framework, we qualitatively analyze existing contrastive self-supervised learning methods for computer vision, further analyze their performance at different stages, and finally summarize the research status of self-supervised contrastive learning methods in other fields.
APA, Harvard, Vancouver, ISO, and other styles
6

Jaiswal, Ashish, Ashwin Ramesh Babu, Mohammad Zaki Zadeh, Debapriya Banerjee, and Fillia Makedon. "A Survey on Contrastive Self-Supervised Learning." Technologies 9, no. 1 (December 28, 2020): 2. http://dx.doi.org/10.3390/technologies9010002.

Full text
Abstract:
Self-supervised learning has gained popularity because of its ability to avoid the cost of annotating large-scale datasets. It is capable of adopting self-defined pseudolabels as supervision and using the learned representations for several downstream tasks. Specifically, contrastive learning has recently become a dominant component in self-supervised learning for computer vision, natural language processing (NLP), and other domains. It aims at embedding augmented versions of the same sample close to each other while trying to push away embeddings from different samples. This paper provides an extensive review of self-supervised methods that follow the contrastive approach. The work explains commonly used pretext tasks in a contrastive learning setup, followed by different architectures that have been proposed so far. Next, we present a performance comparison of different methods for multiple downstream tasks such as image classification, object detection, and action recognition. Finally, we conclude with the limitations of the current methods and the need for further techniques and future directions to make meaningful progress.
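The contrastive objective the survey centres on can be sketched in a few lines: embed two augmented views of each sample, treat matching views as positives (the diagonal of the similarity matrix) and all other pairs as negatives. This InfoNCE-style loss in NumPy is a generic illustration on assumed toy embeddings, not the exact formulation of any one surveyed method.

```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """InfoNCE-style contrastive loss: z1[i] and z2[i] are embeddings of
    two augmented views of the same sample (positives); every other pair
    in the batch serves as a negative."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = (z1 @ z2.T) / temperature            # pairwise cosine similarities
    sim -= sim.max(axis=1, keepdims=True)      # numerical stability
    log_probs = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))        # positives sit on the diagonal

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
aligned = info_nce(z, z + 0.01 * rng.normal(size=(8, 16)))   # matched views
mismatched = info_nce(z, rng.normal(size=(8, 16)))           # unrelated views
```

As expected, the loss is much lower when the two sets of embeddings really are views of the same samples than when they are unrelated.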
APA, Harvard, Vancouver, ISO, and other styles
7

ITO, Seiya, Naoshi KANEKO, and Kazuhiko SUMI. "Self-Supervised Learning for Multi-View Stereo." Journal of the Japan Society for Precision Engineering 86, no. 12 (December 5, 2020): 1042–50. http://dx.doi.org/10.2493/jjspe.86.1042.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Tenorio, M. F., and W. T. Lee. "Self-organizing network for optimum supervised learning." IEEE Transactions on Neural Networks 1, no. 1 (March 1990): 100–110. http://dx.doi.org/10.1109/72.80209.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Florence, Peter, Lucas Manuelli, and Russ Tedrake. "Self-Supervised Correspondence in Visuomotor Policy Learning." IEEE Robotics and Automation Letters 5, no. 2 (April 2020): 492–99. http://dx.doi.org/10.1109/lra.2019.2956365.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Liu, Chicheng, Libin Song, Jiwen Zhang, Ken Chen, and Jing Xu. "Self-Supervised Learning for Specified Latent Representation." IEEE Transactions on Fuzzy Systems 28, no. 1 (January 2020): 47–59. http://dx.doi.org/10.1109/tfuzz.2019.2904237.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Dissertations / Theses on the topic "Self-supervised learning"

1

Vančo, Timotej. "Self-supervised učení v aplikacích počítačového vidění." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2021. http://www.nusl.cz/ntk/nusl-442510.

Full text
Abstract:
The aim of the diploma thesis is to survey self-supervised learning in computer vision applications, choose a suitable test task with an extensive data set, apply self-supervised methods, and evaluate them. The theoretical part of the work focuses on the description of methods in computer vision, a detailed description of neural and convolutional networks, and an extensive explanation and taxonomy of self-supervised methods. The theoretical part concludes with practical applications of self-supervised methods. The practical part of the diploma thesis describes the creation of code for working with datasets and the application of the SSL methods Rotation, SimCLR, MoCo, and BYOL to classification and semantic segmentation. Each application of a method is explained in detail and evaluated for various parameters on the large STL10 dataset. Subsequently, the success of the methods is evaluated on different datasets, and the limiting conditions in the classification task are identified. The practical part concludes with the application of SSL methods for pre-training the encoder for semantic segmentation on the Cityscapes dataset.
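Of the SSL methods applied in the thesis, Rotation has the simplest supervision signal: rotate an image by a random multiple of 90 degrees and ask the network to predict which rotation was applied. A minimal sketch of how those free pretext labels are generated (the 4x4 integer "image" is an illustrative assumption):

```python
import numpy as np

def rotation_pretext(image, rng):
    """One Rotation-pretext sample: rotate by a random multiple of 90
    degrees; the rotation index is the free supervision label."""
    k = int(rng.integers(0, 4))       # label in {0, 1, 2, 3}
    return np.rot90(image, k), k

rng = np.random.default_rng(0)
img = np.arange(16).reshape(4, 4)     # toy stand-in for an image
rotated, label = rotation_pretext(img, rng)
```

Undoing the labelled rotation with `np.rot90(rotated, -label)` recovers the original image, which is exactly what makes the label consistent supervision.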
APA, Harvard, Vancouver, ISO, and other styles
2

Khan, Umair. "Self-supervised deep learning approaches to speaker recognition." Doctoral thesis, Universitat Politècnica de Catalunya, 2021. http://hdl.handle.net/10803/671496.

Full text
Abstract:
In speaker recognition, i-vectors have been the state-of-the-art unsupervised technique over the last few years, whereas x-vectors are becoming the state-of-the-art supervised technique. Recent advances in Deep Learning (DL) approaches to speaker recognition have improved performance but are constrained by the need for labels for the background data. In practice, labeled background data is not easily accessible, especially when large training data is required. In i-vector based speaker recognition, cosine and Probabilistic Linear Discriminant Analysis (PLDA) are the two basic scoring techniques. Cosine scoring is unsupervised, whereas PLDA parameters are typically trained using speaker-labeled background data. This creates a big performance gap between these two scoring techniques. The question is: how can this performance gap be filled without using speaker labels for the background data? In this thesis, the above-mentioned problem has been addressed using DL approaches without using, or while limiting the use of, labeled background data. Three DL-based proposals have been made. In the first proposal, a Restricted Boltzmann Machine (RBM) vector representation of speech is proposed for the tasks of speaker clustering and tracking in TV broadcast shows. This representation is referred to as the RBM vector. Experiments on the AGORA database show that in speaker clustering the RBM vectors gain a relative improvement of 12% in terms of Equal Impurity (EI). For the speaker tracking task, RBM vectors are used only in the speaker identification part, where the relative improvement in terms of Equal Error Rate (EER) is 11% and 7% using cosine and PLDA scoring, respectively. In the second proposal, DL approaches are proposed to increase the discriminative power of i-vectors in speaker verification. We have proposed the use of autoencoders in several ways.
Firstly, an autoencoder will be used as a pre-training for a Deep Neural Network (DNN) using a large amount of unlabeled background data. Then, a DNN classifier will be trained using relatively small labeled data. Secondly, an autoencoder will be trained to transform i-vectors into a new representation to increase their discriminative power. The training will be carried out based on the nearest neighbor i-vectors which will be chosen in an unsupervised manner. The evaluation was performed on VoxCeleb-1 database. The results show that using the first system, we gain a relative improvement of 21% in terms of EER, over i-vector/PLDA. Whereas, using the second system, a relative improvement of 42% is gained. If we use the background data in the testing part, a relative improvement of 53% is gained. In the third proposal, we will train a self-supervised end-to-end speaker verification system. The idea is to utilize impostor samples along with the nearest neighbor samples to make client/impostor pairs in an unsupervised manner. The architecture will be based on a Convolutional Neural Network (CNN) encoder, trained as a siamese network with two branch networks. Another network with three branches will also be trained using triplet loss, in order to extract unsupervised speaker embeddings. The experimental results show that both the end-to-end system and the speaker embeddings, despite being unsupervised, show a comparable performance to the supervised baseline. Moreover, their score combination can further improve the performance. The proposed approaches for speaker verification have respective pros and cons. The best result was obtained using the nearest neighbor autoencoder with a disadvantage of relying on background i-vectors in the testing. On the contrary, the autoencoder pre-training for DNN is not bound by this factor but is a semi-supervised approach. The third proposal is free from both these constraints and performs pretty reasonably. 
It is a self-supervised approach and it does not require the background i-vectors in the testing phase.
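The third proposal trains a three-branch network with a triplet loss to extract unsupervised speaker embeddings. A minimal NumPy version of that loss (the margin value and the toy anchor/positive/impostor embeddings are assumptions for illustration, not the thesis's configuration):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss over batches of embeddings: pull anchors toward
    their (nearest-neighbour) positives and push them at least `margin`
    further from the impostor negatives."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return np.mean(np.maximum(d_pos - d_neg + margin, 0.0))

a = np.zeros((4, 3))                  # toy anchor embeddings
p = 0.1 * np.ones((4, 3))             # same-speaker neighbours (close)
n = np.ones((4, 3))                   # impostors (far)
easy = triplet_loss(a, p, n)          # satisfied triplets: zero loss
hard = triplet_loss(a, n, p)          # roles swapped: positive loss
```

In the thesis the client/impostor pairs feeding such a loss are chosen in an unsupervised manner, from nearest-neighbour and impostor samples.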
APA, Harvard, Vancouver, ISO, and other styles
3

Korecki, John Nicholas. "Semi-Supervised Self-Learning on Imbalanced Data Sets." Scholar Commons, 2010. https://scholarcommons.usf.edu/etd/1686.

Full text
Abstract:
Semi-supervised self-learning algorithms have been shown to improve classifier accuracy under a variety of conditions. In this thesis, semi-supervised self-learning using ensembles of random forests and fuzzy c-means clustering similarity was applied to three data sets to show where improvement is possible over random forests alone. Two of the data sets are emulations of large simulations in which the data may be distributed. Additionally, the ratio of majority to minority class examples in the training set was altered to examine the effect of training set bias on performance when applying the semi-supervised algorithm.
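A minimal sketch of the semi-supervised self-learning loop described above, with a nearest-centroid classifier standing in for the thesis's random-forest ensembles and fuzzy c-means similarity (the confidence measure, threshold, and toy data are illustrative assumptions):

```python
import numpy as np

def self_train(X_lab, y_lab, X_unlab, threshold=0.9, rounds=5):
    """Nearest-centroid self-learning: repeatedly pseudo-label the
    unlabelled points the current model is most confident about and
    absorb them into the training set."""
    X, y = X_lab.copy(), y_lab.copy()
    pool = X_unlab.copy()
    for _ in range(rounds):
        if len(pool) == 0:
            break
        classes = np.unique(y)
        centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
        dists = np.linalg.norm(pool[:, None, :] - centroids[None, :, :], axis=2)
        probs = np.exp(-dists)
        probs /= probs.sum(axis=1, keepdims=True)  # confidence via softmax(-distance)
        conf, idx = probs.max(axis=1), probs.argmax(axis=1)
        keep = conf >= threshold
        if not keep.any():
            break
        X = np.vstack([X, pool[keep]])
        y = np.concatenate([y, classes[idx[keep]]])
        pool = pool[~keep]
    return X, y

# One labelled point per class plus an unlabelled pool around each.
X_lab = np.array([[0.0, 0.0], [5.0, 5.0]])
y_lab = np.array([0, 1])
rng = np.random.default_rng(0)
X_unlab = np.vstack([rng.normal(0.0, 0.2, (10, 2)),
                     rng.normal(5.0, 0.2, (10, 2))])
X_out, y_out = self_train(X_lab, y_lab, X_unlab)
```

The thesis's class-imbalance experiments correspond to skewing the ratio of the two pools above; the loop itself is unchanged.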
APA, Harvard, Vancouver, ISO, and other styles
4

Govindarajan, Hariprasath. "Self-Supervised Representation Learning for Content Based Image Retrieval." Thesis, Linköpings universitet, Statistik och maskininlärning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-166223.

Full text
Abstract:
Automotive technologies and fully autonomous driving have seen a tremendous growth in recent times and have benefitted from extensive deep learning research. State-of-the-art deep learning methods are largely supervised and require labelled data for training. However, the annotation process for image data is time-consuming and costly in terms of human effort. It is of interest to find informative samples for labelling by Content Based Image Retrieval (CBIR). Generally, a CBIR method takes a query image as input and returns a set of images that are semantically similar to the query image. The image retrieval is achieved by transforming images to feature representations in a latent space, where it is possible to reason about image similarity in terms of image content. In this thesis, a self-supervised method is developed to learn feature representations of road scene images. The self-supervised method learns feature representations for images by adapting intermediate convolutional features from an existing deep Convolutional Neural Network (CNN). A contrastive approach based on Noise Contrastive Estimation (NCE) is used to train the feature learning model. For complex images like road scenes, where multiple image aspects can occur simultaneously, it is important to embed all the salient image aspects in the feature representation. To achieve this, the output feature representation is obtained as an ensemble of feature embeddings which are learned by focusing on different image aspects. An attention mechanism is incorporated to encourage each ensemble member to focus on different image aspects. For comparison, a self-supervised model without attention is considered and a simple dimensionality reduction approach using SVD is treated as the baseline. The methods are evaluated on nine different evaluation datasets using CBIR performance metrics.
The datasets correspond to different image aspects and concern the images at different spatial levels - global, semi-global and local. The feature representations learned by self-supervised methods are shown to perform better than the SVD approach. Taking into account that no labelled data is required for training, learning representations for road scenes images using self-supervised methods appear to be a promising direction. Usage of multiple query images to emphasize a query intention is investigated and a clear improvement in CBIR performance is observed. It is inconclusive whether the addition of an attentive mechanism impacts CBIR performance. The attention method shows some positive signs based on qualitative analysis and also performs better than other methods for one of the evaluation datasets containing a local aspect. This method for learning feature representations is promising but requires further research involving more diverse and complex image aspects.
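Once features are learned, the CBIR step itself reduces to ranking gallery embeddings by similarity to the query embedding. A minimal cosine-similarity sketch (toy random features here, not the thesis's learned representations):

```python
import numpy as np

def retrieve(query, gallery, k=3):
    """Rank gallery items by cosine similarity of their embeddings to
    the query embedding and return the top-k indices."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return np.argsort(-(g @ q))[:k]

rng = np.random.default_rng(0)
gallery = rng.normal(size=(10, 32))               # toy feature embeddings
query = gallery[4] + 0.01 * rng.normal(size=32)   # near-duplicate of item 4
top = retrieve(query, gallery, k=3)
```

The multiple-query-image variant investigated in the thesis would aggregate several query embeddings (e.g. by averaging) before the same ranking step.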
APA, Harvard, Vancouver, ISO, and other styles
5

Zangeneh, Kamali Fereidoon. "Self-supervised learning of camera egomotion using epipolar geometry." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286286.

Full text
Abstract:
Visual odometry is one of the prevalent techniques for the positioning of autonomous agents equipped with cameras. Several recent works in this field have in various ways attempted to exploit the capabilities of deep neural networks to improve the performance of visual odometry solutions. One of such approaches is using an end-to-end learning-based solution to infer the egomotion of the camera from a sequence of input images. The state of the art end-to-end solutions employ a common self-supervised training strategy that minimises a notion of photometric error formed by the view synthesis of the input images. As this error is a function of the predicted egomotion, its minimisation corresponds to the learning of egomotion estimation by the network. However, this also requires the depth information of the images, for which an additional depth estimation network is introduced in training. This implies that for end-to-end learning of camera egomotion, a set of parameters are required to be learned, which are not used in inference. In this work, we propose a novel learning strategy using epipolar geometry, which does not rely on depth estimations. Empirical evaluation of our method demonstrates its comparable performance to the baseline work that relies on explicit depth estimations for training.
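The epipolar constraint that the proposed training strategy builds on relates corresponding normalised image points x1, x2 through the essential matrix: x2ᵀ E x1 = 0 with E = [t]ₓ R. It can be checked numerically; the pose and points below are illustrative, not from the thesis.

```python
import numpy as np

def essential_matrix(R, t):
    """E = [t]_x R for relative camera pose (R, t), with P2 = R P1 + t."""
    tx = np.array([[0.0, -t[2], t[1]],
                   [t[2], 0.0, -t[0]],
                   [-t[1], t[0], 0.0]])
    return tx @ R

def epipolar_residual(E, x1, x2):
    """|x2^T E x1| in normalised image coordinates; zero for a true match."""
    return float(abs(x2 @ E @ x1))

theta = 0.1                                       # small rotation about z
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 0.2, 0.0])
E = essential_matrix(R, t)
P = np.array([0.5, -0.3, 4.0])                    # 3D point, camera-1 frame
x1 = P / P[2]                                     # normalised image point, view 1
P2 = R @ P + t
x2 = P2 / P2[2]                                   # its match in view 2
```

A residual of this form, summed over point correspondences, gives a training signal that depends on egomotion (R, t) but not on per-pixel depth, which is the thesis's motivation for dropping the auxiliary depth network.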
APA, Harvard, Vancouver, ISO, and other styles
6

Sharma, Vivek [Verfasser], and R. [Akademischer Betreuer] Stiefelhagen. "Self-supervised Face Representation Learning / Vivek Sharma ; Betreuer: R. Stiefelhagen." Karlsruhe : KIT-Bibliothek, 2020. http://d-nb.info/1212512545/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Coen, Michael Harlan. "Multimodal dynamics : self-supervised learning in perceptual and motor systems." Thesis, Massachusetts Institute of Technology, 2006. http://hdl.handle.net/1721.1/34022.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Includes bibliographical references (leaves 178-192).
This thesis presents a self-supervised framework for perceptual and motor learning based upon correlations in different sensory modalities. The brain and cognitive sciences have gathered an enormous body of neurological and phenomenological evidence in the past half century demonstrating the extraordinary degree of interaction between sensory modalities during the course of ordinary perception. We develop a framework for creating artificial perceptual systems that draws on these findings, where the primary architectural motif is the cross-modal transmission of perceptual information to enhance each sensory channel individually. We present self-supervised algorithms for learning perceptual grounding, intersensory influence, and sensorymotor coordination, which derive training signals from internal cross-modal correlations rather than from external supervision. Our goal is to create systems that develop by interacting with the world around them, inspired by development in animals. We demonstrate this framework with: (1) a system that learns the number and structure of vowels in American English by simultaneously watching and listening to someone speak. The system then cross-modally clusters the correlated auditory and visual data.
It has no advance linguistic knowledge and receives no information outside of its sensory channels. This work is the first unsupervised acquisition of phonetic structure of which we are aware, outside of that done by human infants. (2) a system that learns to sing like a zebra finch, following the developmental stages of a juvenile zebra finch. It first learns the song of an adult male and then listens to its own initially nascent attempts at mimicry through an articulatory synthesizer. In acquiring the birdsong to which it was initially exposed, this system demonstrates self-supervised sensorimotor learning. It also demonstrates afferent and efferent equivalence - the system learns motor maps with the same computational framework used for learning sensory maps.
by Michael Harlan Coen.
Ph.D.
APA, Harvard, Vancouver, ISO, and other styles
8

Nyströmer, Carl. "Musical Instrument Activity Detection using Self-Supervised Learning and Domain Adaptation." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280810.

Full text
Abstract:
With the ever growing media and music catalogs, tools that search and navigate this data are important. For more complex search queries, meta-data is needed, but to manually label the vast amounts of new content is impossible. In this thesis, automatic labeling of musical instrument activities in song mixes is investigated, with a focus on ways to alleviate the lack of annotated data for instrument activity detection models. Two methods for alleviating the problem of small amounts of data are proposed and evaluated. Firstly, a self-supervised approach based on automatic labeling and mixing of randomized instrument stems is investigated. Secondly, a domain-adaptation approach that trains models on sampled MIDI files for instrument activity detection on recorded music is explored. The self-supervised approach yields better results compared to the baseline and points to the fact that deep learning models can learn instrument activity detection without an intrinsic musical structure in the audio mix. The domain-adaptation models trained solely on sampled MIDI files performed worse than the baseline, however using MIDI data in conjunction with recorded music boosted the performance. A hybrid model combining both self-supervised learning and domain adaptation by using both sampled MIDI data and recorded music produced the best results overall.
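The self-supervised stem-mixing idea can be sketched directly: mixing a random subset of instrument stems automatically yields the multi-label instrument-activity annotation for the resulting mix. The stem names and signal shapes below are assumptions for illustration.

```python
import numpy as np

def mix_stems(stems, rng):
    """Mix a random subset of instrument stems; the chosen subset is the
    free multi-label annotation for instrument activity detection."""
    names = sorted(stems)
    active = [n for n in names if rng.random() < 0.5]
    if not active:                                 # keep at least one stem
        active = [names[int(rng.integers(len(names)))]]
    mix = np.sum([stems[n] for n in active], axis=0)
    labels = {n: n in active for n in names}
    return mix, labels

rng = np.random.default_rng(0)
stems = {name: rng.normal(size=100)                # toy mono audio stems
         for name in ["bass", "drums", "guitar", "vocals"]}
mix, labels = mix_stems(stems, rng)
```

As the thesis notes, such mixes lack an intrinsic musical structure, yet models trained on them still learn instrument activity detection.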
APA, Harvard, Vancouver, ISO, and other styles
9

Nett, Ryan. "Dataset and Evaluation of Self-Supervised Learning for Panoramic Depth Estimation." DigitalCommons@CalPoly, 2020. https://digitalcommons.calpoly.edu/theses/2234.

Full text
Abstract:
Depth detection is a very common computer vision problem. It shows up primarily in robotics, automation, or 3D visualization domains, as it is essential for converting images to point clouds. One of the poster child applications is self driving cars. Currently, the best methods for depth detection are either very expensive, like LIDAR, or require precise calibration, like stereo cameras. These costs have given rise to attempts to detect depth from a monocular camera (a single camera). While this is possible, it is harder than LIDAR or stereo methods since depth can't be measured from monocular images, it has to be inferred. A good example is covering one eye: you still have some idea how far away things are, but it's not exact. Neural networks are a natural fit for this. Here, we build on previous neural network methods by applying a recent state of the art model to panoramic images in addition to pinhole ones and performing a comparative evaluation. First, we create a simulated depth detection dataset that lends itself to panoramic comparisons and contains pre-made cylindrical and spherical panoramas. We then modify monodepth2 to support cylindrical and cubemap panoramas, incorporating current best practices for depth detection on those panorama types, and evaluate its performance for each type of image using our dataset. We also consider the resources used in training and other qualitative factors.
APA, Harvard, Vancouver, ISO, and other styles
10

Baleia, José Rodrigo Ferreira. "Haptic robot-environment interaction for self-supervised learning in ground mobility." Master's thesis, Faculdade de Ciências e Tecnologia, 2014. http://hdl.handle.net/10362/12475.

Full text
Abstract:
Dissertation submitted for the degree of Master in Electrical and Computer Engineering
This dissertation presents a system for haptic interaction and self-supervised learning mechanisms to ascertain navigation affordances from depth cues. A simple pan-tilt telescopic arm and a structured light sensor, both fitted to the robot’s body frame, provide the required haptic and depth sensory feedback. The system aims to incrementally develop the ability to assess the cost of navigating in natural environments. For this purpose, the robot learns a mapping between the appearance of objects, given sensory data provided by the sensor, and their bendability, perceived by the pan-tilt telescopic arm. The object descriptor, representing the object in memory and used for comparisons with other objects, is rich enough for robust comparison yet simple enough to allow fast computation. The output of the memory learning mechanism, allied with the haptic interaction point evaluation, prioritizes interaction points to increase confidence in the interaction and correctly identify obstacles, reducing the risk of the robot getting stuck or damaged. If the system concludes that the object is traversable, the environment change detection system allows the robot to overcome it. A set of field trials shows the ability of the robot to progressively learn which elements of the environment are traversable.
APA, Harvard, Vancouver, ISO, and other styles

Books on the topic "Self-supervised learning"

1

Munro, Paul. Self-supervised learning of concepts by single units and "weakly local" representations. Pittsburgh, PA: School of Library and Information Science, University of Pittsburgh, 1988.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
2

Munro, Paul. Self-supervised learning of concepts by single units and "weakly local" representations. School of Library and Information Science, University of Pittsburgh, 1988.

Find full text
APA, Harvard, Vancouver, ISO, and other styles

Book chapters on the topic "Self-supervised learning"

1

Nedelkoski, Sasho, Jasmin Bogatinovski, Alexander Acker, Jorge Cardoso, and Odej Kao. "Self-supervised Log Parsing." In Machine Learning and Knowledge Discovery in Databases: Applied Data Science Track, 122–38. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-67667-4_8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Jawed, Shayan, Josif Grabocka, and Lars Schmidt-Thieme. "Self-supervised Learning for Semi-supervised Time Series Classification." In Advances in Knowledge Discovery and Data Mining, 499–511. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-47426-3_39.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Jamaludin, Amir, Timor Kadir, and Andrew Zisserman. "Self-supervised Learning for Spinal MRIs." In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, 294–302. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-67558-9_34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Liu, Fengbei, Yu Tian, Filipe R. Cordeiro, Vasileios Belagiannis, Ian Reid, and Gustavo Carneiro. "Self-supervised Mean Teacher for Semi-supervised Chest X-Ray Classification." In Machine Learning in Medical Imaging, 426–36. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-87589-3_44.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Si, Chenyang, Xuecheng Nie, Wei Wang, Liang Wang, Tieniu Tan, and Jiashi Feng. "Adversarial Self-supervised Learning for Semi-supervised 3D Action Recognition." In Computer Vision – ECCV 2020, 35–51. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-58571-6_3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Zhang, Ruifei, Sishuo Liu, Yizhou Yu, and Guanbin Li. "Self-supervised Correction Learning for Semi-supervised Biomedical Image Segmentation." In Medical Image Computing and Computer Assisted Intervention – MICCAI 2021, 134–44. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-87196-3_13.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Valvano, Gabriele, Andrea Leo, and Sotirios A. Tsaftaris. "Self-supervised Multi-scale Consistency for Weakly Supervised Segmentation Learning." In Domain Adaptation and Representation Transfer, and Affordable Healthcare and AI for Resource Diverse Global Health, 14–24. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-87722-4_2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Feng, Ruibin, Zongwei Zhou, Michael B. Gotway, and Jianming Liang. "Parts2Whole: Self-supervised Contrastive Learning via Reconstruction." In Domain Adaptation and Representation Transfer, and Distributed and Collaborative Learning, 85–95. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-60548-3_9.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Cervera, Enrique, and Angel P. Pobil. "Multiple self-organizing maps for supervised learning." In Lecture Notes in Computer Science, 345–52. Berlin, Heidelberg: Springer Berlin Heidelberg, 1995. http://dx.doi.org/10.1007/3-540-59497-3_195.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Karlos, Stamatis, Nikos Fazakis, Sotiris Kotsiantis, and Kyriakos Sgarbas. "Self-Train LogitBoost for Semi-supervised Learning." In Engineering Applications of Neural Networks, 139–48. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-23983-5_14.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "Self-supervised learning"

1

An, Yuexuan, Hui Xue, Xingyu Zhao, and Lu Zhang. "Conditional Self-Supervised Learning for Few-Shot Classification." In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/295.

Full text
Abstract:
How to learn a transferable feature representation from limited examples is a key challenge for few-shot classification. Self-supervision as an auxiliary task to the main supervised few-shot task is considered a conceivable way to address the problem, since self-supervision can provide additional structural information easily ignored by the main task. However, learning a good representation with traditional self-supervised methods usually depends on large training sets. In few-shot scenarios, due to the lack of sufficient samples, these self-supervised methods may learn a biased representation, which is more likely to misguide the main task and ultimately degrade performance. In this paper, we propose conditional self-supervised learning (CSS), which uses auxiliary information to guide the representation learning of self-supervised tasks. Specifically, CSS leverages supervised information as prior knowledge to shape and improve the feature manifold learned by self-supervision without auxiliary unlabeled data, so as to reduce representation bias and mine more effective semantic information. Moreover, CSS extracts complementary information through supervised learning and the improved self-supervised learning, respectively, and integrates that information into a unified distribution, which further enriches and broadens the original representation. Extensive experiments demonstrate that our proposed method, without any fine-tuning, achieves a significant accuracy improvement in few-shot classification scenarios compared to state-of-the-art few-shot learning methods.
APA, Harvard, Vancouver, ISO, and other styles
2

Beyer, Lucas, Xiaohua Zhai, Avital Oliver, and Alexander Kolesnikov. "S4L: Self-Supervised Semi-Supervised Learning." In 2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2019. http://dx.doi.org/10.1109/iccv.2019.00156.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Basaj, Dominika, Witold Oleszkiewicz, Igor Sieradzki, Michał Górszczak, Barbara Rychalska, Tomasz Trzcinski, and Bartosz Zieliński. "Explaining Self-Supervised Image Representations with Visual Probing." In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/82.

Full text
Abstract:
Recently introduced self-supervised methods for image representation learning provide results on par with or superior to their fully supervised competitors, yet the corresponding efforts to explain self-supervised approaches lag behind. Motivated by this observation, we introduce a novel visual probing framework for explaining self-supervised models by leveraging probing tasks previously employed in natural language processing. The probing tasks require knowledge about semantic relationships between image parts. Hence, we propose a systematic approach to obtain analogs of natural language in vision, such as visual words, context, and taxonomy. We show the effectiveness and applicability of these analogs in the context of explaining self-supervised representations. Our key findings emphasize that relations between language and vision can serve as an effective yet intuitive tool for discovering how machine learning models work, independently of data modality. Our work opens a plethora of research pathways towards more explainable and transparent AI.
APA, Harvard, Vancouver, ISO, and other styles
4

Song, Jinwoo, and Young B. Moon. "Infill Defective Detection System Augmented by Semi-Supervised Learning." In ASME 2020 International Mechanical Engineering Congress and Exposition. American Society of Mechanical Engineers, 2020. http://dx.doi.org/10.1115/imece2020-23249.

Full text
Abstract:
In an effort to identify cyber-attacks on infill structures, detection systems based on supervised learning have been attempted in Additive Manufacturing (AM) security investigations. However, supervised learning requires a myriad of training data sets to achieve acceptable detection accuracy. Besides, since it is impossible to train for unprecedented defect types, such detection systems cannot guarantee robustness against unforeseen attacks. To overcome these disadvantages of supervised learning, this paper presents an infill defective detection system (IDDS) augmented by semi-supervised learning. Semi-supervised learning allows classifying a large volume of unlabeled data sets by training on a comparably small number of labeled data sets. Additionally, IDDS exploits self-training to increase robustness against various defect types that are not pre-trained. IDDS consists of three stages: feature extraction, pre-training, and self-training. To validate the usefulness of IDDS, five defect types were designed and tested with IDDS, which was trained on only normal labeled data sets. The results are compared with the baseline accuracy of a perceptron network model trained with supervised learning.
APA, Harvard, Vancouver, ISO, and other styles
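The self-training stage the abstract describes, growing the labeled set with confident pseudo-labels, can be sketched generically. The toy loop below is an assumption for illustration only: a nearest-centroid classifier stands in for the paper's actual network, and the confidence rule (nearest centroid at least `threshold` times closer than the runner-up) is hypothetical.

```python
import math

def self_train(labeled, unlabeled, rounds=5, threshold=2.0):
    """Toy self-training loop: grow the labeled set with confident pseudo-labels.

    `labeled` is a list of (feature_vector, label) pairs; `unlabeled` is a
    list of feature vectors. A sample receives a pseudo-label only when the
    nearest class centroid is at least `threshold` times closer than the
    second nearest; ambiguous samples stay in the pool.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        # Re-fit one centroid per class on the current labeled set.
        by_class = {}
        for x, y in labeled:
            by_class.setdefault(y, []).append(x)
        centroids = {y: [sum(col) / len(xs) for col in zip(*xs)]
                     for y, xs in by_class.items()}
        newly_labeled, rest = [], []
        for x in pool:
            dists = sorted((math.dist(x, c), y) for y, c in centroids.items())
            if len(dists) > 1 and dists[1][0] >= threshold * dists[0][0]:
                newly_labeled.append((x, dists[0][1]))  # confident pseudo-label
            else:
                rest.append(x)
        if not newly_labeled:
            break  # no sample cleared the confidence bar; stop early
        labeled += newly_labeled
        pool = rest
    return labeled, pool
```

Samples near the decision boundary (equidistant from both centroids) never clear the bar and remain unlabeled, which is the behavior that keeps self-training from reinforcing its own mistakes.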
5

Wu, Jiawei, Xin Wang, and William Yang Wang. "Self-Supervised Dialogue Learning." In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/p19-1375.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Li, Pengyong, Jun Wang, Ziliang Li, Yixuan Qiao, Xianggen Liu, Fei Ma, Peng Gao, Sen Song, and Guotong Xie. "Pairwise Half-graph Discrimination: A Simple Graph-level Self-supervised Strategy for Pre-training Graph Neural Networks." In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/371.

Full text
Abstract:
Self-supervised learning has gradually emerged as a powerful technique for graph representation learning. However, transferable, generalizable, and robust representation learning on graph data remains a challenge for pre-training graph neural networks. In this paper, we propose a simple and effective self-supervised pre-training strategy, named Pairwise Half-graph Discrimination (PHD), that explicitly pre-trains a graph neural network at the graph level. PHD is designed as a simple binary classification task: discriminating whether two half-graphs come from the same source. Experiments demonstrate that PHD is an effective pre-training strategy that offers comparable or superior performance on 13 graph classification tasks compared with state-of-the-art strategies, and achieves notable improvements when combined with node-level strategies. Moreover, visualization of the learned representations reveals that the PHD strategy indeed empowers the model to learn graph-level knowledge such as the molecular scaffold. These results establish PHD as a powerful and effective self-supervised strategy for graph-level representation learning.
APA, Harvard, Vancouver, ISO, and other styles
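The pretext task behind PHD reduces to constructing labeled pairs of half-graphs. The sketch below shows one plausible pair-construction scheme under stated assumptions (graphs as edge lists, halves obtained by a random node split, negatives drawn from the next graph in the list); the paper's actual splitting and sampling details may differ.

```python
import random

def make_phd_pairs(graphs, seed=0):
    """Build training pairs for a PHD-style binary classification pretext task.

    Each graph is a list of edges (u, v). A graph is cut into two half-graphs
    by randomly splitting its node set; a positive pair joins two halves of
    the same graph (label 1), a negative pair mixes halves of different
    graphs (label 0).
    """
    rng = random.Random(seed)

    def halves(edges):
        nodes = sorted({n for e in edges for n in e})
        rng.shuffle(nodes)
        first = set(nodes[: len(nodes) // 2])
        a = [e for e in edges if e[0] in first and e[1] in first]
        b = [e for e in edges if e[0] not in first and e[1] not in first]
        return a, b  # edges internal to each half; cut edges are dropped

    pairs = []
    for i, g in enumerate(graphs):
        a, b = halves(g)
        pairs.append(((a, b), 1))  # same-source halves: positive
        other = graphs[(i + 1) % len(graphs)]  # any different graph will do
        if other is not g:
            pairs.append(((a, halves(other)[1]), 0))  # mixed halves: negative
    return pairs
```

A graph encoder is then trained to predict the label from the two halves, which is what pushes it to capture graph-level structure rather than purely local node features.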
7

Hu, Yazhe, and Tomonari Furukawa. "A Self-Supervised Learning Technique for Road Defects Detection Based on Monocular Three-Dimensional Reconstruction." In ASME 2019 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 2019. http://dx.doi.org/10.1115/detc2019-98135.

Full text
Abstract:
This paper presents a self-supervised learning technique for road surface defect detection using a monocular camera. The uniqueness of the proposed technique lies in its self-supervised learning structure, achieved by combining physics-driven three-dimensional (3D) reconstruction with a data-driven Convolutional Neural Network (CNN). Only images from one camera are needed as inputs to the model, without human labeling. The 3D point cloud is reconstructed from input images via a near-planar road 3D reconstruction process that self-supervises the learning. During testing, the network receives images and predicts each image as defect or non-defect. A refined class prediction is produced by combining the 3D road surface data with the network output when the confidence of the original network prediction is not strong enough to conclude the classification. Experiments are conducted on real road surface images to find the optimal parameters for the model. The testing results demonstrate the robustness and effectiveness of the proposed self-supervised road surface defect detection technique.
APA, Harvard, Vancouver, ISO, and other styles
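The "near-planar road" assumption suggests how geometry can generate labels without a human: fit a plane to the reconstructed road points and flag large deviations as defects. The sketch below is a hedged illustration of that idea, not the paper's pipeline; the function name, the least-squares plane model z = a·x + b·y + c, and the deviation threshold are all assumptions.

```python
def flag_road_defects(points, tol=0.02):
    """Flag 3D road points whose height deviates from a fitted plane.

    `points` is a list of (x, y, z) samples from a reconstructed road
    surface. A plane z = a*x + b*y + c is fitted by least squares; points
    farther than `tol` from the plane are labelled defects. Such labels can
    then self-supervise an image classifier.
    """
    # Accumulate the sums for the 3x3 least-squares normal equations.
    sx = sum(p[0] for p in points); sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points); n = len(points)
    sxx = sum(p[0] ** 2 for p in points); syy = sum(p[1] ** 2 for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points); syz = sum(p[1] * p[2] for p in points)
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    # Solve A * [a, b, c] = rhs by Gaussian elimination (no pivoting; fine
    # for well-spread road samples).
    for i in range(3):
        for j in range(i + 1, 3):
            f = A[j][i] / A[i][i]
            A[j] = [aj - f * ai for aj, ai in zip(A[j], A[i])]
            rhs[j] -= f * rhs[i]
    coeffs = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        coeffs[i] = (rhs[i] - sum(A[i][k] * coeffs[k]
                                  for k in range(i + 1, 3))) / A[i][i]
    a, b, c = coeffs
    return [abs(z - (a * x + b * y + c)) > tol for x, y, z in points]
```

On a mostly flat patch with one raised point, the fit tracks the road and only the outlier is flagged, which is exactly the free supervisory signal the near-planar assumption buys.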
8

Shao, Shuai, Lei Xing, Wei Yu, Rui Xu, Yan-Jiang Wang, and Bao-Di Liu. "SSDL: Self-Supervised Dictionary Learning." In 2021 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 2021. http://dx.doi.org/10.1109/icme51207.2021.9428336.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Kamimura, Ryotaro. "Self-enhancement learning: Self-supervised and target-creating learning." In 2009 International Joint Conference on Neural Networks (IJCNN 2009 - Atlanta). IEEE, 2009. http://dx.doi.org/10.1109/ijcnn.2009.5178677.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Cho, Hyunsoo, Jinseok Seol, and Sang-goo Lee. "Masked Contrastive Learning for Anomaly Detection." In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/198.

Full text
Abstract:
Detecting anomalies is a fundamental requirement of safety-critical software systems, yet it remains a long-standing problem. Numerous lines of work have been proposed to address it and have shown promising results. In particular, methods based on self-supervised learning are spurring interest due to their capability of learning diverse representations without additional labels. Among self-supervised learning tactics, contrastive learning is one framework showing pronounced results in various fields, including anomaly detection. However, the primary objective of contrastive learning is to learn task-agnostic features without any labels, which is not entirely suited to discerning anomalies. In this paper, we propose a task-specific variant of contrastive learning named masked contrastive learning, which is better suited to anomaly detection. Moreover, we propose a new inference method, dubbed self-ensemble inference, that further boosts performance by leveraging the ability learned through auxiliary self-supervision tasks. By combining our models, we outperform previous state-of-the-art methods by a significant margin on various benchmark datasets.
APA, Harvard, Vancouver, ISO, and other styles
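The base objective that masked contrastive learning modifies is the standard InfoNCE/NT-Xent loss: each embedding is pulled toward its positive (the other augmented view of the same sample) and pushed from all others. The sketch below shows only that unmasked base loss in plain Python, not the paper's masked variant; the function name and the pairing convention (view i pairs with view i+N) are assumptions.

```python
import math

def info_nce_loss(embeddings, temperature=0.5):
    """InfoNCE/NT-Xent loss over 2N embeddings, where embedding i and
    embedding (i + N) mod 2N are two augmented views of the same sample.

    For each anchor, the positive's cosine similarity is contrasted against
    all other embeddings via a softmax cross-entropy.
    """
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    n = len(embeddings) // 2
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the positive view
        # Temperature-scaled similarities to every other embedding.
        logits = [cos(embeddings[i], embeddings[j]) / temperature
                  for j in range(2 * n) if j != i]
        target = pos if pos < i else pos - 1  # positive's index after dropping self
        # Cross-entropy: -log softmax probability of the positive.
        log_z = math.log(sum(math.exp(l) for l in logits))
        total += log_z - logits[target]
    return total / (2 * n)
```

When positives coincide and negatives are orthogonal, the loss is small; scrambling the pairing raises it, which is the gradient signal the masked variant then reshapes with class information.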