A selection of scholarly literature on the topic "Self-supervised learning"

Cite a source in APA, MLA, Chicago, Harvard, and other citation styles


Browse the lists of current articles, books, theses, reports, and other scholarly sources on the topic "Self-supervised learning".

Next to every work in the bibliography there is an "Add to bibliography" option. Use it, and the bibliographic reference for the chosen work will be formatted automatically in the required citation style (APA, MLA, Harvard, Chicago, Vancouver, etc.).

You can also download the full text of the scholarly publication as a PDF and read its online abstract, whenever the relevant parameters are available in the metadata.

Journal articles on the topic "Self-supervised learning"

1

Zhao, Qingyu, Zixuan Liu, Ehsan Adeli, and Kilian M. Pohl. "Longitudinal self-supervised learning". Medical Image Analysis 71 (July 2021): 102051. http://dx.doi.org/10.1016/j.media.2021.102051.

2

Wang, Fei, and Changshui Zhang. "Robust self-tuning semi-supervised learning". Neurocomputing 70, no. 16-18 (October 2007): 2931–39. http://dx.doi.org/10.1016/j.neucom.2006.11.004.

3

Hrycej, Tomas. "Supporting supervised learning by self-organization". Neurocomputing 4, no. 1-2 (February 1992): 17–30. http://dx.doi.org/10.1016/0925-2312(92)90040-v.

4

Shin, Sungho, Jongwon Kim, Yeonguk Yu, Seongju Lee, and Kyoobin Lee. "Self-Supervised Transfer Learning from Natural Images for Sound Classification". Applied Sciences 11, no. 7 (March 29, 2021): 3043. http://dx.doi.org/10.3390/app11073043.

Abstract:
We propose the implementation of transfer learning from natural images to audio-based images using self-supervised learning schemes. Through self-supervised learning, convolutional neural networks (CNNs) can learn the general representation of natural images without labels. In this study, a convolutional neural network was pre-trained with natural images (ImageNet) via self-supervised learning; subsequently, it was fine-tuned on the target audio samples. Pre-training with the self-supervised learning scheme significantly improved sound classification performance when validated on the following benchmarks: ESC-50, UrbanSound8k, and GTZAN. The network pre-trained via self-supervised learning achieved a similar level of accuracy to networks pre-trained using a supervised method that requires labels. Therefore, we demonstrated that transfer learning from natural images contributes to improvements in audio-related tasks, and that self-supervised learning with natural images is an adequate pre-training scheme in terms of simplicity and effectiveness.
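To make the transfer recipe above concrete, here is a minimal PyTorch sketch of the fine-tuning stage: a ResNet-18 encoder whose weights are assumed to come from self-supervised ImageNet pre-training gets a new 50-way head for ESC-50. The checkpoint path and the spectrogram data loader are hypothetical placeholders, not artifacts of the paper.

    import torch
    import torch.nn as nn
    from torchvision import models

    # Encoder assumed to be pre-trained with self-supervision (placeholder checkpoint).
    encoder = models.resnet18(weights=None)
    encoder.load_state_dict(torch.load("ssl_resnet18_imagenet.pt"), strict=False)

    # Replace the classification head for the 50 classes of ESC-50, then fine-tune.
    encoder.fc = nn.Linear(encoder.fc.in_features, 50)
    optimizer = torch.optim.Adam(encoder.parameters(), lr=1e-4)
    criterion = nn.CrossEntropyLoss()

    for spectrograms, labels in spectrogram_loader:  # hypothetical DataLoader of audio images
        optimizer.zero_grad()
        loss = criterion(encoder(spectrograms), labels)
        loss.backward()
        optimizer.step()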
5

Liu, Yuanyuan, and Qianqian Liu. "Research on Self-Supervised Comparative Learning for Computer Vision". Journal of Electronic Research and Application 5, no. 3 (August 17, 2021): 5–17. http://dx.doi.org/10.26689/jera.v5i3.2320.

Abstract:
In recent years, self-supervised learning, which does not require large numbers of manual labels, has generated supervision signals from the data itself to learn representations of samples. Self-supervised learning solves the problem of learning semantic features from unlabeled data and enables pre-training of models on large datasets. Its significant advantages have been studied extensively by scholars in recent years. There are usually three types of self-supervised learning: generative, contrastive, and generative-contrastive. The model of the comparative (contrastive) learning method is relatively simple, and its performance on current downstream tasks is comparable to that of supervised learning methods. Therefore, we propose a conceptual analysis framework: data augmentation pipeline, architectures, pretext tasks, comparison methods, and semi-supervised fine-tuning. Based on this conceptual framework, we qualitatively analyze existing comparative self-supervised learning methods for computer vision, then further analyze their performance at different stages, and finally summarize the research status of self-supervised comparative learning methods in other fields.
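As an illustration of the first stage of this framework, the data augmentation pipeline, the following torchvision sketch builds a typical SimCLR-style transform; the particular operations and parameters are common choices from the contrastive learning literature, not values taken from this paper.

    from torchvision import transforms

    # Two independently augmented views of one image form a positive pair.
    augment = transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
        transforms.RandomGrayscale(p=0.2),
        transforms.GaussianBlur(kernel_size=23),
        transforms.ToTensor(),
    ])

    view_1, view_2 = augment(image), augment(image)  # image: a PIL image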
6

Jaiswal, Ashish, Ashwin Ramesh Babu, Mohammad Zaki Zadeh, Debapriya Banerjee, and Fillia Makedon. "A Survey on Contrastive Self-Supervised Learning". Technologies 9, no. 1 (December 28, 2020): 2. http://dx.doi.org/10.3390/technologies9010002.

Abstract:
Self-supervised learning has gained popularity because of its ability to avoid the cost of annotating large-scale datasets. It is capable of adopting self-defined pseudolabels as supervision and using the learned representations for several downstream tasks. Specifically, contrastive learning has recently become a dominant component in self-supervised learning for computer vision, natural language processing (NLP), and other domains. It aims at embedding augmented versions of the same sample close to each other while trying to push away embeddings from different samples. This paper provides an extensive review of self-supervised methods that follow the contrastive approach. The work explains commonly used pretext tasks in a contrastive learning setup, followed by the different architectures that have been proposed so far. Next, we present a performance comparison of different methods for multiple downstream tasks such as image classification, object detection, and action recognition. Finally, we conclude with the limitations of the current methods and the need for further techniques and future directions to make meaningful progress.
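The contrastive objective described here is commonly instantiated as the NT-Xent (InfoNCE) loss. The compact PyTorch version below is a generic illustration of that loss, not code from the survey.

    import torch
    import torch.nn.functional as F

    def nt_xent(z1, z2, temperature=0.5):
        """z1, z2: (N, d) embeddings of two augmented views; rows i of z1 and z2 match."""
        n = z1.size(0)
        z = F.normalize(torch.cat([z1, z2]), dim=1)   # (2N, d) unit-length embeddings
        sim = z @ z.t() / temperature                 # pairwise cosine similarities
        sim.fill_diagonal_(float("-inf"))             # exclude self-similarity
        targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
        return F.cross_entropy(sim, targets)          # pull positives together, push the rest apart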
7

ITO, Seiya, Naoshi KANEKO, and Kazuhiko SUMI. "Self-Supervised Learning for Multi-View Stereo". Journal of the Japan Society for Precision Engineering 86, no. 12 (December 5, 2020): 1042–50. http://dx.doi.org/10.2493/jjspe.86.1042.

8

Tenorio, M. F., and W. T. Lee. "Self-organizing network for optimum supervised learning". IEEE Transactions on Neural Networks 1, no. 1 (March 1990): 100–110. http://dx.doi.org/10.1109/72.80209.

9

Florence, Peter, Lucas Manuelli, and Russ Tedrake. "Self-Supervised Correspondence in Visuomotor Policy Learning". IEEE Robotics and Automation Letters 5, no. 2 (April 2020): 492–99. http://dx.doi.org/10.1109/lra.2019.2956365.

10

Liu, Chicheng, Libin Song, Jiwen Zhang, Ken Chen, and Jing Xu. "Self-Supervised Learning for Specified Latent Representation". IEEE Transactions on Fuzzy Systems 28, no. 1 (January 2020): 47–59. http://dx.doi.org/10.1109/tfuzz.2019.2904237.


Theses on the topic "Self-supervised learning"

1

Vančo, Timotej. "Self-supervised učení v aplikacích počítačového vidění" [Self-supervised learning in computer vision applications]. Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2021. http://www.nusl.cz/ntk/nusl-442510.

Abstract:
The aim of this diploma thesis is to survey self-supervised learning in computer vision applications, choose a suitable test task with an extensive dataset, apply self-supervised methods, and evaluate them. The theoretical part of the work describes methods in computer vision, gives a detailed description of neural and convolutional networks, and provides an extensive explanation and taxonomy of self-supervised methods. The conclusion of the theoretical part is devoted to practical applications of self-supervised methods. The practical part of the thesis describes the code created for working with datasets and the application of the SSL methods Rotation, SimCLR, MoCo, and BYOL to classification and semantic segmentation. Each application of a method is explained in detail and evaluated for various parameters on the large STL10 dataset. Subsequently, the success of the methods is evaluated on different datasets and the limiting conditions in the classification task are identified. The practical part concludes with the application of SSL methods for pre-training the encoder for semantic segmentation on the Cityscapes dataset.
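Of the methods named above, the Rotation pretext task is the simplest to reproduce. The sketch below is a generic RotNet-style formulation (not the thesis code): a batch of images becomes a four-way rotation classification problem.

    import torch

    def rotation_pretext(images):
        """images: (B, C, H, W). Returns rotated copies and rotation labels 0-3."""
        views, labels = [], []
        for k in range(4):                                  # 0, 90, 180, 270 degrees
            views.append(torch.rot90(images, k, dims=(2, 3)))
            labels.append(torch.full((images.size(0),), k, dtype=torch.long))
        return torch.cat(views), torch.cat(labels)          # feed to an ordinary classifier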
2

Khan, Umair. "Self-supervised deep learning approaches to speaker recognition". Doctoral thesis, Universitat Politècnica de Catalunya, 2021. http://hdl.handle.net/10803/671496.

Abstract:
In speaker recognition, i-vectors have been the state-of-the-art unsupervised technique over the last few years, whereas x-vectors are becoming the state-of-the-art supervised technique these days. Recent advances in Deep Learning (DL) approaches to speaker recognition have improved performance but are constrained by the need for labels for the background data. In practice, labeled background data is not easily accessible, especially when large training data is required. In i-vector based speaker recognition, cosine and Probabilistic Linear Discriminant Analysis (PLDA) are the two basic scoring techniques. Cosine scoring is unsupervised, whereas PLDA parameters are typically trained using speaker-labeled background data. This creates a large performance gap between the two scoring techniques. The question is: how to fill this performance gap without using speaker labels for the background data? In this thesis, the above-mentioned problem is addressed using DL approaches without using, or while limiting the use of, labeled background data. Three DL-based proposals are made. In the first proposal, a Restricted Boltzmann Machine (RBM) vector representation of speech is proposed for the tasks of speaker clustering and tracking in TV broadcast shows. This representation is referred to as the RBM vector. Experiments on the AGORA database show that in speaker clustering the RBM vectors gain a relative improvement of 12% in terms of Equal Impurity (EI). For the speaker tracking task, RBM vectors are used only in the speaker identification part, where the relative improvement in terms of Equal Error Rate (EER) is 11% and 7% using cosine and PLDA scoring, respectively. In the second proposal, DL approaches are used to increase the discriminative power of i-vectors in speaker verification, employing autoencoders in several ways. First, an autoencoder is used to pre-train a Deep Neural Network (DNN) on a large amount of unlabeled background data, after which a DNN classifier is trained using relatively little labeled data. Second, an autoencoder is trained to transform i-vectors into a new representation with increased discriminative power. The training is carried out on nearest-neighbor i-vectors, which are chosen in an unsupervised manner. The evaluation is performed on the VoxCeleb-1 database. The results show that the first system gains a relative improvement of 21% in terms of EER over i-vector/PLDA, whereas the second system gains a relative improvement of 42%. If the background data is used in the testing part, a relative improvement of 53% is gained. In the third proposal, a self-supervised end-to-end speaker verification system is trained. The idea is to utilize impostor samples along with nearest-neighbor samples to make client/impostor pairs in an unsupervised manner. The architecture is based on a Convolutional Neural Network (CNN) encoder trained as a siamese network with two branches. Another network with three branches is trained using a triplet loss in order to extract unsupervised speaker embeddings. The experimental results show that both the end-to-end system and the speaker embeddings, despite being unsupervised, perform comparably to the supervised baseline, and combining their scores can further improve performance. The proposed approaches have respective pros and cons. The best result was obtained using the nearest-neighbor autoencoder, with the disadvantage of relying on background i-vectors at test time. In contrast, the autoencoder pre-training for the DNN is not bound by this factor but is a semi-supervised approach. The third proposal is free from both of these constraints and performs quite reasonably: it is a self-supervised approach and does not require the background i-vectors in the testing phase.
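For reference, the triplet objective used for the unsupervised speaker embeddings in the third proposal is typically implemented as below. The pairing of anchors with nearest-neighbour positives and impostor negatives follows the description in the abstract; the margin value is an illustrative assumption.

    import torch.nn.functional as F

    def triplet_loss(anchor, positive, negative, margin=0.3):
        """(N, d) embeddings; positive = nearest-neighbour sample, negative = impostor,
        both selected without speaker labels."""
        d_pos = F.pairwise_distance(anchor, positive)
        d_neg = F.pairwise_distance(anchor, negative)
        return F.relu(d_pos - d_neg + margin).mean()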
3

Korecki, John Nicholas. "Semi-Supervised Self-Learning on Imbalanced Data Sets". Scholar Commons, 2010. https://scholarcommons.usf.edu/etd/1686.

Abstract:
Semi-supervised self-learning algorithms have been shown to improve classifier accuracy under a variety of conditions. In this thesis, semi-supervised self-learning using ensembles of random forests and fuzzy c-means clustering similarity was applied to three data sets to show where improvement is possible over random forests alone. Two of the data sets are emulations of large simulations in which the data may be distributed. Additionally, the ratio of majority to minority class examples in the training set was altered to examine the effect of training set bias on performance when applying the semi-supervised algorithm.
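A minimal scikit-learn sketch of such a self-learning loop is given below; the confidence threshold and round count are illustrative, and the fuzzy c-means similarity component of the thesis is omitted for brevity.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def self_train(X_lab, y_lab, X_unlab, rounds=5, threshold=0.9):
        """Iteratively move confidently pseudo-labeled points into the training set."""
        clf = RandomForestClassifier(n_estimators=100)
        for _ in range(rounds):
            clf.fit(X_lab, y_lab)
            if len(X_unlab) == 0:
                break
            proba = clf.predict_proba(X_unlab)
            confident = proba.max(axis=1) >= threshold
            if not confident.any():
                break
            pseudo = clf.classes_[proba[confident].argmax(axis=1)]
            X_lab = np.vstack([X_lab, X_unlab[confident]])
            y_lab = np.concatenate([y_lab, pseudo])
            X_unlab = X_unlab[~confident]
        return clf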
4

Govindarajan, Hariprasath. "Self-Supervised Representation Learning for Content Based Image Retrieval". Thesis, Linköpings universitet, Statistik och maskininlärning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-166223.

Abstract:
Automotive technologies and fully autonomous driving have seen tremendous growth in recent times and have benefitted from extensive deep learning research. State-of-the-art deep learning methods are largely supervised and require labelled data for training. However, the annotation process for image data is time-consuming and costly in terms of human effort. It is therefore of interest to find informative samples for labelling by Content Based Image Retrieval (CBIR). Generally, a CBIR method takes a query image as input and returns a set of images that are semantically similar to the query image. The image retrieval is achieved by transforming images to feature representations in a latent space, where it is possible to reason about image similarity in terms of image content. In this thesis, a self-supervised method is developed to learn feature representations of road scene images. The self-supervised method learns feature representations for images by adapting intermediate convolutional features from an existing deep Convolutional Neural Network (CNN). A contrastive approach based on Noise Contrastive Estimation (NCE) is used to train the feature learning model. For complex images like road scenes, where multiple image aspects can occur simultaneously, it is important to embed all the salient image aspects in the feature representation. To achieve this, the output feature representation is obtained as an ensemble of feature embeddings which are learned by focusing on different image aspects. An attention mechanism is incorporated to encourage each ensemble member to focus on different image aspects. For comparison, a self-supervised model without attention is considered, and a simple dimensionality reduction approach using SVD is treated as the baseline. The methods are evaluated on nine different evaluation datasets using CBIR performance metrics. The datasets correspond to different image aspects and concern the images at different spatial levels: global, semi-global, and local. The feature representations learned by the self-supervised methods are shown to perform better than the SVD approach. Taking into account that no labelled data is required for training, learning representations for road scene images using self-supervised methods appears to be a promising direction. The use of multiple query images to emphasize a query intention is investigated, and a clear improvement in CBIR performance is observed. It is inconclusive whether the addition of an attention mechanism impacts CBIR performance. The attention method shows some positive signs based on qualitative analysis and also performs better than the other methods for one of the evaluation datasets containing a local aspect. This method for learning feature representations is promising but requires further research involving more diverse and complex image aspects.
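Once such representations are learned, the retrieval step itself is straightforward. The sketch below ranks a gallery by cosine similarity and includes the multi-query variant mentioned above; averaging the normalized query embeddings is one simple combination rule and is an assumption here, not necessarily the thesis procedure.

    import numpy as np

    def retrieve(query_embs, gallery_embs, k=5):
        """query_embs: (Q, d) one or more query embeddings; gallery_embs: (M, d)."""
        q = query_embs / np.linalg.norm(query_embs, axis=1, keepdims=True)
        g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
        query = q.mean(axis=0)                 # combine multiple queries into one intention
        query /= np.linalg.norm(query)
        scores = g @ query                     # cosine similarity to each gallery image
        return np.argsort(-scores)[:k]         # indices of the k most similar images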
5

Zangeneh Kamali, Fereidoon. "Self-supervised learning of camera egomotion using epipolar geometry". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286286.

Abstract:
Visual odometry is one of the prevalent techniques for the positioning of autonomous agents equipped with cameras. Several recent works in this field have in various ways attempted to exploit the capabilities of deep neural networks to improve the performance of visual odometry solutions. One such approach is an end-to-end learning-based solution that infers the egomotion of the camera from a sequence of input images. The state-of-the-art end-to-end solutions employ a common self-supervised training strategy that minimises a notion of photometric error formed by view synthesis of the input images. As this error is a function of the predicted egomotion, its minimisation corresponds to the network learning egomotion estimation. However, this also requires the depth information of the images, for which an additional depth estimation network is introduced in training. This implies that for end-to-end learning of camera egomotion, a set of parameters must be learned that is not used at inference. In this work, we propose a novel learning strategy based on epipolar geometry, which does not rely on depth estimates. Empirical evaluation of our method demonstrates performance comparable to the baseline work that relies on explicit depth estimation for training.
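The core of the proposed strategy, penalizing violations of the epipolar constraint x2^T E x1 = 0 between matched points, can be sketched as follows. This is a generic formulation; the exact residual used in the thesis (e.g. a Sampson-weighted variant) may differ.

    import torch

    def skew(t):
        """3x3 skew-symmetric matrix [t]_x of a translation vector t."""
        z = torch.zeros(1, device=t.device)
        return torch.stack([
            torch.cat([z, -t[2:3], t[1:2]]),
            torch.cat([t[2:3], z, -t[0:1]]),
            torch.cat([-t[1:2], t[0:1], z]),
        ])

    def epipolar_loss(x1, x2, R, t):
        """x1, x2: (N, 3) matched points in normalized homogeneous coordinates;
        R, t: predicted egomotion. Penalizes |x2^T E x1| with E = [t]_x R."""
        E = skew(t) @ R
        residual = (x2 * (x1 @ E.t())).sum(dim=1)
        return residual.abs().mean()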
6

Sharma, Vivek. "Self-supervised Face Representation Learning". Supervised by R. Stiefelhagen. Karlsruhe: KIT-Bibliothek, 2020. http://d-nb.info/1212512545/34.

7

Coen, Michael Harlan. "Multimodal dynamics: self-supervised learning in perceptual and motor systems". Thesis, Massachusetts Institute of Technology, 2006. http://hdl.handle.net/1721.1/34022.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Includes bibliographical references (leaves 178-192).
This thesis presents a self-supervised framework for perceptual and motor learning based upon correlations in different sensory modalities. The brain and cognitive sciences have gathered an enormous body of neurological and phenomenological evidence in the past half century demonstrating the extraordinary degree of interaction between sensory modalities during the course of ordinary perception. We develop a framework for creating artificial perceptual systems that draws on these findings, where the primary architectural motif is the cross-modal transmission of perceptual information to enhance each sensory channel individually. We present self-supervised algorithms for learning perceptual grounding, intersensory influence, and sensorimotor coordination, which derive training signals from internal cross-modal correlations rather than from external supervision. Our goal is to create systems that develop by interacting with the world around them, inspired by development in animals. We demonstrate this framework with: (1) a system that learns the number and structure of vowels in American English by simultaneously watching and listening to someone speak. The system then cross-modally clusters the correlated auditory and visual data. It has no advance linguistic knowledge and receives no information outside of its sensory channels. This work is the first unsupervised acquisition of phonetic structure of which we are aware, outside of that done by human infants. (2) a system that learns to sing like a zebra finch, following the developmental stages of a juvenile zebra finch. It first learns the song of an adult male and then listens to its own initially nascent attempts at mimicry through an articulatory synthesizer. In acquiring the birdsong to which it was initially exposed, this system demonstrates self-supervised sensorimotor learning. It also demonstrates afferent and efferent equivalence: the system learns motor maps with the same computational framework used for learning sensory maps.
8

Nyströmer, Carl. "Musical Instrument Activity Detection using Self-Supervised Learning and Domain Adaptation". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280810.

Abstract:
With the ever-growing media and music catalogs, tools that search and navigate this data are important. For more complex search queries, meta-data is needed, but manually labeling the vast amounts of new content is impossible. In this thesis, automatic labeling of musical instrument activities in song mixes is investigated, with a focus on ways to alleviate the lack of annotated data for instrument activity detection models. Two methods for alleviating the problem of small amounts of data are proposed and evaluated. Firstly, a self-supervised approach based on automatic labeling and mixing of randomized instrument stems is investigated. Secondly, a domain-adaptation approach that trains models on sampled MIDI files for instrument activity detection on recorded music is explored. The self-supervised approach yields better results compared to the baseline and points to the fact that deep learning models can learn instrument activity detection without an intrinsic musical structure in the audio mix. The domain-adaptation models trained solely on sampled MIDI files performed worse than the baseline; however, using MIDI data in conjunction with recorded music boosted the performance. A hybrid model combining both self-supervised learning and domain adaptation, using both sampled MIDI data and recorded music, produced the best results overall.
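The stem-mixing idea can be pictured in a few lines: random subsets of instrument stems are mixed, and the multi-hot activity label comes for free from the stems that were included. The gain range and inclusion probability below are illustrative assumptions, not values from the thesis.

    import numpy as np

    def random_mix(stems, p_include=0.5, rng=None):
        """stems: dict mapping instrument name -> waveform (equal lengths).
        Returns an audio mix and its multi-hot instrument-activity label."""
        rng = rng or np.random.default_rng()
        names = sorted(stems)
        active = [n for n in names if rng.random() < p_include]
        if not active:                                   # guarantee a non-silent mix
            active = [rng.choice(names)]
        mix = sum(rng.uniform(0.5, 1.0) * stems[n] for n in active)
        label = np.array([float(n in active) for n in names])
        return mix / len(active), label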
9

Nett, Ryan. "Dataset and Evaluation of Self-Supervised Learning for Panoramic Depth Estimation". DigitalCommons@CalPoly, 2020. https://digitalcommons.calpoly.edu/theses/2234.

Abstract:
Depth detection is a very common computer vision problem. It shows up primarily in robotics, automation, and 3D visualization domains, as it is essential for converting images to point clouds. One of the poster-child applications is self-driving cars. Currently, the best methods for depth detection are either very expensive, like LIDAR, or require precise calibration, like stereo cameras. These costs have given rise to attempts to detect depth from a monocular camera (a single camera). While this is possible, it is harder than LIDAR or stereo methods since depth can't be measured from monocular images; it has to be inferred. A good example is covering one eye: you still have some idea how far away things are, but it's not exact. Neural networks are a natural fit for this. Here, we build on previous neural network methods by applying a recent state-of-the-art model to panoramic images in addition to pinhole ones and performing a comparative evaluation. First, we create a simulated depth detection dataset that lends itself to panoramic comparisons and contains pre-made cylindrical and spherical panoramas. We then modify monodepth2 to support cylindrical and cubemap panoramas, incorporating current best practices for depth detection on those panorama types, and evaluate its performance for each type of image using our dataset. We also consider the resources used in training and other qualitative factors.
10

Baleia, José Rodrigo Ferreira. "Haptic robot-environment interaction for self-supervised learning in ground mobility". Master's thesis, Faculdade de Ciências e Tecnologia, 2014. http://hdl.handle.net/10362/12475.

Abstract:
Dissertation submitted for the degree of Master in Electrical and Computer Engineering
This dissertation presents a system for haptic interaction and self-supervised learning mechanisms to ascertain navigation affordances from depth cues. A simple pan-tilt telescopic arm and a structured-light sensor, both fitted to the robot's body frame, provide the required haptic and depth sensory feedback. The system aims to incrementally develop the ability to assess the cost of navigating in natural environments. For this purpose the robot learns a mapping between the appearance of objects, given the sensory data provided by the sensor, and their bendability, perceived by the pan-tilt telescopic arm. The object descriptor, representing the object in memory and used for comparisons with other objects, is rich enough for robust comparison yet simple enough to allow fast computation. The output of the memory learning mechanism, allied with the haptic interaction point evaluation, prioritizes interaction points to increase confidence in the interaction and correctly identify obstacles, reducing the risk of the robot getting stuck or damaged. If the system concludes that the object is traversable, the environment change detection system allows the robot to overcome it. A set of field trials shows the ability of the robot to progressively learn which elements of the environment are traversable.

Books on the topic "Self-supervised learning"

1

Munro, Paul. Self-supervised learning of concepts by single units and "weakly local" representations. Pittsburgh, PA: School of Library and Information Science, University of Pittsburgh, 1988.

2

Munro, Paul. Self-supervised learning of concepts by single units and "weakly local" representations. School of Library and Information Science, University of Pittsburgh, 1988.


Book chapters on the topic "Self-supervised learning"

1

Nedelkoski, Sasho, Jasmin Bogatinovski, Alexander Acker, Jorge Cardoso, and Odej Kao. "Self-supervised Log Parsing". In Machine Learning and Knowledge Discovery in Databases: Applied Data Science Track, 122–38. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-67667-4_8.

2

Jawed, Shayan, Josif Grabocka, and Lars Schmidt-Thieme. "Self-supervised Learning for Semi-supervised Time Series Classification". In Advances in Knowledge Discovery and Data Mining, 499–511. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-47426-3_39.

3

Jamaludin, Amir, Timor Kadir, and Andrew Zisserman. "Self-supervised Learning for Spinal MRIs". In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, 294–302. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-67558-9_34.

4

Liu, Fengbei, Yu Tian, Filipe R. Cordeiro, Vasileios Belagiannis, Ian Reid, and Gustavo Carneiro. "Self-supervised Mean Teacher for Semi-supervised Chest X-Ray Classification". In Machine Learning in Medical Imaging, 426–36. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-87589-3_44.

5

Si, Chenyang, Xuecheng Nie, Wei Wang, Liang Wang, Tieniu Tan, and Jiashi Feng. "Adversarial Self-supervised Learning for Semi-supervised 3D Action Recognition". In Computer Vision – ECCV 2020, 35–51. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-58571-6_3.

6

Zhang, Ruifei, Sishuo Liu, Yizhou Yu, and Guanbin Li. "Self-supervised Correction Learning for Semi-supervised Biomedical Image Segmentation". In Medical Image Computing and Computer Assisted Intervention – MICCAI 2021, 134–44. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-87196-3_13.

7

Valvano, Gabriele, Andrea Leo, and Sotirios A. Tsaftaris. "Self-supervised Multi-scale Consistency for Weakly Supervised Segmentation Learning". In Domain Adaptation and Representation Transfer, and Affordable Healthcare and AI for Resource Diverse Global Health, 14–24. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-87722-4_2.

8

Feng, Ruibin, Zongwei Zhou, Michael B. Gotway, and Jianming Liang. "Parts2Whole: Self-supervised Contrastive Learning via Reconstruction". In Domain Adaptation and Representation Transfer, and Distributed and Collaborative Learning, 85–95. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-60548-3_9.

9

Cervera, Enrique, and Angel P. Pobil. "Multiple self-organizing maps for supervised learning". In Lecture Notes in Computer Science, 345–52. Berlin, Heidelberg: Springer Berlin Heidelberg, 1995. http://dx.doi.org/10.1007/3-540-59497-3_195.

10

Karlos, Stamatis, Nikos Fazakis, Sotiris Kotsiantis, and Kyriakos Sgarbas. "Self-Train LogitBoost for Semi-supervised Learning". In Engineering Applications of Neural Networks, 139–48. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-23983-5_14.


Conference papers on the topic "Self-supervised learning"

1

An, Yuexuan, Hui Xue, Xingyu Zhao, and Lu Zhang. "Conditional Self-Supervised Learning for Few-Shot Classification". In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/295.

Abstract:
How to learn a transferable feature representation from limited examples is a key challenge for few-shot classification. Self-supervision as an auxiliary task to the main supervised few-shot task is considered a conceivable way to solve the problem, since self-supervision can provide additional structural information that is easily ignored by the main task. However, learning a good representation with traditional self-supervised methods usually depends on large numbers of training samples. In few-shot scenarios, due to the lack of sufficient samples, these self-supervised methods might learn a biased representation, which is more likely to mislead the main task and ultimately degrade performance. In this paper, we propose conditional self-supervised learning (CSS) to use auxiliary information to guide the representation learning of self-supervised tasks. Specifically, CSS leverages supervised information as prior knowledge to shape and improve the learned feature manifold of self-supervision without auxiliary unlabeled data, so as to reduce representation bias and mine more effective semantic information. Moreover, CSS exploits more meaningful information through supervised learning and the improved self-supervised learning respectively, and integrates the information into a unified distribution, which can further enrich and broaden the original representation. Extensive experiments demonstrate that our proposed method, without any fine-tuning, achieves a significant accuracy improvement in few-shot classification scenarios compared to state-of-the-art few-shot learning methods.
2

Beyer, Lucas, Xiaohua Zhai, Avital Oliver, and Alexander Kolesnikov. "S4L: Self-Supervised Semi-Supervised Learning". In 2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2019. http://dx.doi.org/10.1109/iccv.2019.00156.

3

Basaj, Dominika, Witold Oleszkiewicz, Igor Sieradzki, Michał Górszczak, Barbara Rychalska, Tomasz Trzcinski, and Bartosz Zieliński. "Explaining Self-Supervised Image Representations with Visual Probing". In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/82.

Abstract:
Recently introduced self-supervised methods for image representation learning provide results on par with or superior to those of their fully supervised competitors, yet the corresponding efforts to explain the self-supervised approaches lag behind. Motivated by this observation, we introduce a novel visual probing framework for explaining self-supervised models by leveraging probing tasks previously employed in natural language processing. The probing tasks require knowledge about semantic relationships between image parts. Hence, we propose a systematic approach to obtain analogs of natural language in vision, such as visual words, context, and taxonomy. We show the effectiveness and applicability of those analogs in the context of explaining self-supervised representations. Our key findings emphasize that relations between language and vision can serve as an effective yet intuitive tool for discovering how machine learning models work, independently of data modality. Our work opens a plethora of research pathways towards more explainable and transparent AI.
4

Song, Jinwoo, and Young B. Moon. "Infill Defective Detection System Augmented by Semi-Supervised Learning". In ASME 2020 International Mechanical Engineering Congress and Exposition. American Society of Mechanical Engineers, 2020. http://dx.doi.org/10.1115/imece2020-23249.

Abstract:
In an effort to identify cyber-attacks on infill structures, detection systems based on supervised learning have been attempted in Additive Manufacturing (AM) security investigations. However, supervised learning requires a myriad of training data sets to achieve acceptable detection accuracy. Moreover, since it is impossible to train for unprecedented defective types, such detection systems cannot guarantee robustness against unforeseen attacks. To overcome these disadvantages of supervised learning, this paper presents an infill defective detection system (IDDS) augmented by semi-supervised learning. Semi-supervised learning allows classifying a sheer volume of unlabeled data sets by training on a comparably small number of labeled data sets. Additionally, IDDS exploits self-training to increase robustness against various defective types that are not pre-trained. IDDS consists of feature extraction, pre-training, and self-training stages. To validate the usefulness of IDDS, five defective types were designed and tested with IDDS, which was trained on only normal labeled data sets. The results are compared with the baseline accuracy of a perceptron network model trained with supervised learning.
5

Wu, Jiawei, Xin Wang, and William Yang Wang. "Self-Supervised Dialogue Learning". In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/p19-1375.

6

Li, Pengyong, Jun Wang, Ziliang Li, Yixuan Qiao, Xianggen Liu, Fei Ma, Peng Gao, Sen Song, and Guotong Xie. "Pairwise Half-graph Discrimination: A Simple Graph-level Self-supervised Strategy for Pre-training Graph Neural Networks". In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/371.

Abstract:
Self-supervised learning has gradually emerged as a powerful technique for graph representation learning. However, transferable, generalizable, and robust representation learning on graph data remains a challenge for pre-training graph neural networks. In this paper, we propose a simple and effective self-supervised pre-training strategy, named Pairwise Half-graph Discrimination (PHD), that explicitly pre-trains a graph neural network at the graph level. PHD is designed as a simple binary classification task: discriminate whether two half-graphs come from the same source. Experiments demonstrate that PHD is an effective pre-training strategy that offers comparable or superior performance on 13 graph classification tasks compared with state-of-the-art strategies, and achieves notable improvements when combined with node-level strategies. Moreover, visualization of the learned representations revealed that the PHD strategy indeed empowers the model to learn graph-level knowledge such as the molecular scaffold. These results establish PHD as a powerful and effective self-supervised learning strategy in graph-level representation learning.
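The pretext task is easy to picture with a short sketch: each graph is split into two halves, and a classifier learns to say whether two halves share a source. The BFS-based split below is one plausible reading of the "half-graph" notion and is an assumption, not the authors' exact procedure.

    import random
    import networkx as nx

    def half_graphs(g):
        """Split a graph into two halves along a BFS ordering."""
        order = list(nx.bfs_tree(g, random.choice(list(g.nodes))))
        half = set(order[: g.number_of_nodes() // 2])
        return g.subgraph(half).copy(), g.subgraph(set(g.nodes) - half).copy()

    def phd_pair(g_a, g_b):
        """A PHD training pair: label 1 iff both halves come from the same graph."""
        half_1, _ = half_graphs(g_a)
        _, half_2 = half_graphs(g_b)
        return half_1, half_2, int(g_a is g_b)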
7

Hu, Yazhe, and Tomonari Furukawa. "A Self-Supervised Learning Technique for Road Defects Detection Based on Monocular Three-Dimensional Reconstruction". In ASME 2019 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 2019. http://dx.doi.org/10.1115/detc2019-98135.

Abstract:
This paper presents a self-supervised learning technique for road surface defect detection using a monocular camera. The uniqueness of the proposed technique lies in its self-supervised learning structure, which combines physics-driven three-dimensional (3D) reconstruction with a data-driven Convolutional Neural Network (CNN). Only images from one camera are needed as inputs to the model, without human labeling. The 3D point cloud is reconstructed from input images based on a near-planar road 3D reconstruction process that self-supervises the learning. During testing, the network receives images and classifies each as defective or non-defective. A refined class prediction is produced by combining the 3D road surface data with the network output when the confidence of the original network prediction is not strong enough to conclude the classification. Experiments are conducted on real road surface images to find the optimal parameters for this model. The testing results demonstrate the robustness and effectiveness of the proposed self-supervised road surface defect detection technique.
8

Shao, Shuai, Lei Xing, Wei Yu, Rui Xu, Yan-Jiang Wang, and Bao-Di Liu. "SSDL: Self-Supervised Dictionary Learning". In 2021 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 2021. http://dx.doi.org/10.1109/icme51207.2021.9428336.

9

Kamimura, Ryotaro. "Self-enhancement learning: Self-supervised and target-creating learning". In 2009 International Joint Conference on Neural Networks (IJCNN 2009 - Atlanta). IEEE, 2009. http://dx.doi.org/10.1109/ijcnn.2009.5178677.

10

Cho, Hyunsoo, Jinseok Seol, and Sang-goo Lee. "Masked Contrastive Learning for Anomaly Detection". In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/198.

Abstract:
Detecting anomalies is one fundamental aspect of a safety-critical software system; however, it remains a long-standing problem. Numerous branches of work have been proposed to alleviate the complication and have shown promising results. In particular, self-supervised learning based methods are spurring interest due to their capability of learning diverse representations without additional labels. Among self-supervised learning tactics, contrastive learning is one specific framework showing pronounced results in various fields, including anomaly detection. However, the primary objective of contrastive learning is to learn task-agnostic features without any labels, which is not entirely suited to discerning anomalies. In this paper, we propose a task-specific variant of contrastive learning named masked contrastive learning, which is better suited for anomaly detection. Moreover, we propose a new inference method dubbed self-ensemble inference that further boosts performance by leveraging the ability learned through auxiliary self-supervision tasks. By combining our models, we can outperform previous state-of-the-art methods by a significant margin on various benchmark datasets.