Academic literature on the topic 'Interpolation-Based data augmentation'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Interpolation-Based data augmentation.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Interpolation-Based data augmentation"

1. Oh, Cheolhwan, Seungmin Han, and Jongpil Jeong. "Time-Series Data Augmentation based on Interpolation." Procedia Computer Science 175 (2020): 64–71. http://dx.doi.org/10.1016/j.procs.2020.07.012.

2. Li, Yuliang, Xiaolan Wang, Zhengjie Miao, and Wang-Chiew Tan. "Data augmentation for ML-driven data preparation and integration." Proceedings of the VLDB Endowment 14, no. 12 (July 2021): 3182–85. http://dx.doi.org/10.14778/3476311.3476403.

Abstract:
In recent years, we have witnessed the development of novel data augmentation (DA) techniques for creating additional training data needed by machine-learning-based solutions. In this tutorial, we will provide a comprehensive overview of techniques developed by the data management community for data preparation and data integration. In addition to surveying task-specific DA operators that leverage rules, transformations, and external knowledge for creating additional training data, we also explore advanced DA techniques such as interpolation, conditional generation, and DA policy learning. Finally, we describe the connection between DA and other machine learning paradigms such as active learning, pre-training, and weakly-supervised learning. We hope that this discussion can shed light on future research directions for a holistic data augmentation framework for high-quality dataset creation.
3. Huang, Chenhui, and Akinobu Shibuya. "High Accuracy Geochemical Map Generation Method by a Spatial Autocorrelation-Based Mixture Interpolation Using Remote Sensing Data." Remote Sensing 12, no. 12 (June 21, 2020): 1991. http://dx.doi.org/10.3390/rs12121991.

Abstract:
Generating a high-resolution whole-pixel geochemical contents map from a map with sparse distribution is a regression problem. Currently, multivariate prediction models such as machine learning (ML) models are constructed to raise the geoscience mapping resolution. Methods coupling spatial autocorrelation into the ML model have been proposed for raising ML prediction accuracy, but previously proposed methods require complicated modification of the ML model. In this research, we propose a new algorithm called spatial autocorrelation-based mixture interpolation (SABAMIN), with which it is easier to merge spatial autocorrelation into an ML model using only a data augmentation strategy. To test the feasibility of this concept, remote sensing data including those from the advanced spaceborne thermal emission and reflection radiometer (ASTER), a digital elevation model (DEM), and geophysics (geomagnetic) data were used for the feasibility study, along with copper geochemical and copper mine data from Arizona, USA. We explain why spatial information can be coupled into an ML model by data augmentation alone, and introduce how data augmentation was operated in our case. Four tests—(i) cross-validation of measured data, (ii) the blind test, (iii) the temporal stability test, and (iv) the predictor importance test—were conducted to evaluate the model. As the results show, the model's accuracy was improved compared with a traditional ML model, and the reliability of the algorithm was confirmed. In summary, combining the univariate interpolation method with multivariate prediction via data augmentation proved effective for geological studies.
4. Tsourtis, Anastasios, Georgios Papoutsoglou, and Yannis Pantazis. "GAN-Based Training of Semi-Interpretable Generators for Biological Data Interpolation and Augmentation." Applied Sciences 12, no. 11 (May 27, 2022): 5434. http://dx.doi.org/10.3390/app12115434.

Abstract:
Single-cell measurements incorporate invaluable information regarding the state of each cell and its underlying regulatory mechanisms. The popularity and use of single-cell measurements are constantly growing. Despite the typically large number of collected data, the under-representation of important cell (sub-)populations negatively affects downstream analysis and its robustness. Therefore, the enrichment of biological datasets with samples that belong to a rare state or manifold is overall advantageous. In this work, we train families of generative models via the minimization of Rényi divergence resulting in an adversarial training framework. Apart from the standard neural network-based models, we propose families of semi-interpretable generative models. The proposed models are further tailored to generate realistic gene expression measurements, whose characteristics include zero-inflation and sparsity, without the need for any data pre-processing. Explicit factors of the data such as measurement time, state or cluster are taken into account by our generative models as conditional variables. We train the proposed conditional models and compare them against the state-of-the-art on a range of synthetic and real datasets and demonstrate their ability to accurately perform data interpolation and augmentation.
5. Bi, Xiao-ying, Bo Li, Wen-long Lu, and Xin-zhi Zhou. "Daily runoff forecasting based on data-augmented neural network model." Journal of Hydroinformatics 22, no. 4 (May 16, 2020): 900–915. http://dx.doi.org/10.2166/hydro.2020.017.

Abstract:
Accurate daily runoff prediction plays an important role in the management and utilization of water resources. In order to improve the accuracy of prediction, this paper proposes a deep neural network (CAGANet) composed of a convolutional layer, an attention mechanism, a gated recurrent unit (GRU) neural network, and an autoregressive (AR) model. Given that the daily runoff sequence is abrupt and unstable, it is difficult for a single model or a combined model to obtain high-precision daily runoff predictions directly. Therefore, this paper uses a linear interpolation method to enhance the stability of hydrological data and applies the augmented data to the CAGANet model, the support vector machine (SVM) model, the long short-term memory (LSTM) neural network and the attention-mechanism-based LSTM model (AM-LSTM). The comparison results show that among the four models based on data augmentation, the CAGANet model proposed in this paper has the best prediction accuracy. Its Nash–Sutcliffe efficiency can reach 0.993. Therefore, the CAGANet model based on data augmentation is a feasible daily runoff forecasting scheme.
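The linear-interpolation augmentation step described in this abstract is generic enough to sketch. The snippet below is our illustration only, not code from the paper; the function name and the `factor` parameter are assumptions. It densifies a daily series by inserting linearly interpolated points between consecutive observations:

```python
import numpy as np

def interpolate_series(t, y, factor=2):
    """Densify a time series by linear interpolation: place
    `factor - 1` evenly spaced new points between each pair of
    consecutive observations."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    # New time grid with (factor * (n - 1) + 1) points over the same span.
    t_new = np.linspace(t[0], t[-1], factor * (len(t) - 1) + 1)
    # Piecewise-linear interpolation of the observed values onto it.
    return t_new, np.interp(t_new, t, y)
```

The augmented series would then be fed to whichever forecasting model is being trained (in the paper's comparison: CAGANet, SVM, LSTM, or AM-LSTM).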
6. de Rojas, Ana Lazcano. "Data augmentation in economic time series: Behavior and improvements in predictions." AIMS Mathematics 8, no. 10 (2023): 24528–44. http://dx.doi.org/10.3934/math.20231251.

Abstract:
The performance of neural networks and statistical models in time series prediction is conditioned by the amount of data available. The lack of observations is one of the main factors influencing the representativeness of the underlying patterns and trends. Using data augmentation techniques based on classical statistical techniques and neural networks, it is possible to generate additional observations and improve the accuracy of the predictions. The particular characteristics of economic time series make it necessary that data augmentation techniques do not significantly influence these characteristics, since doing so would alter the quality of the details in the study. This paper analyzes the performance obtained by two data augmentation techniques applied to a time series and finally processed by an ARIMA model and a neural network model to make predictions. The results show a significant improvement in the predictions for the time series augmented by traditional interpolation techniques, obtaining a better fit and correlation with the original series.
7. Xie, Xiangjin, Li Yangning, Wang Chen, Kai Ouyang, Zuotong Xie, and Hai-Tao Zheng. "Global Mixup: Eliminating Ambiguity with Clustering." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 11 (June 26, 2023): 13798–806. http://dx.doi.org/10.1609/aaai.v37i11.26616.

Abstract:
Data augmentation with Mixup has been proven an effective method to regularize current deep neural networks. Mixup generates virtual samples and corresponding labels simultaneously by linear interpolation. However, the one-stage generation paradigm and the use of linear interpolation have two defects: (1) the label of the generated sample is simply combined from the labels of the original sample pair without reasonable judgment, resulting in ambiguous labels; and (2) linear combination significantly restricts the sampling space for generating samples. To address these issues, we propose a novel and effective augmentation method, Global Mixup, based on global clustering relationships. Specifically, we transform the previous one-stage augmentation process into a two-stage process by decoupling the generation of virtual samples from their labeling. The labels of the generated samples are then reassigned based on clustering, by calculating the global relationships of the generated samples. Furthermore, we are no longer restricted to linear relationships, which allows us to generate more reliable virtual samples in a larger sampling space. Extensive experiments with CNN, LSTM, and BERT on five tasks show that Global Mixup outperforms previous baselines. Further experiments also demonstrate the advantage of Global Mixup in low-resource scenarios.
8. Guo, Hongyu. "Nonlinear Mixup: Out-Of-Manifold Data Augmentation for Text Classification." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 4044–51. http://dx.doi.org/10.1609/aaai.v34i04.5822.

Abstract:
Data augmentation with Mixup (Zhang et al. 2018) has been shown to be an effective model regularizer for current state-of-the-art deep classification networks. It generates out-of-manifold samples by linearly interpolating the inputs and the corresponding labels of random sample pairs. Despite its great successes, Mixup requires a convex combination of the inputs as well as the modeling targets of a sample pair, which significantly limits the space of its synthetic samples and consequently its regularization effect. To cope with this limitation, we propose “nonlinear Mixup”. Unlike Mixup, where the input and label pairs share the same, linear, scalar mixing policy, our approach embraces a nonlinear interpolation policy for both the input and label pairs, where the mixing policy for the labels is adaptively learned based on the mixed input. Experiments on benchmark sentence classification datasets indicate that our approach significantly improves upon Mixup. Our empirical studies also show that the out-of-manifold samples generated by our strategy encourage training samples in each class to form a tight representation cluster that is far from others.
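The "same, linear, scalar mixing policy" of standard Mixup that this paper generalizes can be written in a few lines. This is a hypothetical NumPy sketch under the usual Mixup conventions, not code from either paper; the function name and defaults are our assumptions:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Standard Mixup: a single Beta-distributed scalar lambda mixes
    both the inputs and the one-hot labels of a sample pair."""
    rng = rng or np.random.default_rng()
    lam = float(rng.beta(alpha, alpha))
    # The same scalar policy is shared by inputs and labels.
    x = lam * np.asarray(x1) + (1.0 - lam) * np.asarray(x2)
    y = lam * np.asarray(y1) + (1.0 - lam) * np.asarray(y2)
    return x, y, lam
```

Nonlinear Mixup departs from this by replacing the single scalar `lam` with a per-dimension mixing policy for the inputs and by learning the label mixing from the mixed input.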
9. Lim, Seong-Su, and Oh-Wook Kwon. "FrameAugment: A Simple Data Augmentation Method for Encoder–Decoder Speech Recognition." Applied Sciences 12, no. 15 (July 28, 2022): 7619. http://dx.doi.org/10.3390/app12157619.

Abstract:
As the architecture of deep learning-based speech recognizers has recently shifted to the end-to-end style, increasing the effective amount of training data has become an important issue. To tackle this issue, various data augmentation techniques that create additional training data by transforming labeled data have been studied. We propose a method called FrameAugment to augment data by changing the speed of speech locally for selected sections, in contrast to the conventional speed perturbation technique, which changes the speed of speech uniformly over the entire utterance. To change the speed of the selected sections of speech, the number of frames for the randomly selected sections is adjusted through linear interpolation in the spectrogram domain. The proposed method is shown to achieve 6.8% better performance than the baseline on the WSJ database and 9.5% better than the baseline on the LibriSpeech database. It is also confirmed that the proposed method further improves speech recognition performance when combined with previous data augmentation techniques.
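The local frame-count adjustment described in this abstract can be illustrated with plain NumPy. This is our sketch, not the paper's code: it assumes a `(time, freq)` spectrogram array, and the function name and arguments are ours:

```python
import numpy as np

def resample_section(spec, start, end, new_len):
    """Locally change speaking speed: resample frames [start, end) of a
    (time, freq) spectrogram to `new_len` frames via linear interpolation
    along the time axis, leaving the rest of the utterance untouched."""
    seg = spec[start:end]
    old_idx = np.arange(seg.shape[0], dtype=float)
    new_idx = np.linspace(0.0, seg.shape[0] - 1.0, new_len)
    # Interpolate each frequency bin independently over the new time grid.
    warped = np.stack(
        [np.interp(new_idx, old_idx, seg[:, f]) for f in range(seg.shape[1])],
        axis=1,
    )
    return np.concatenate([spec[:start], warped, spec[end:]], axis=0)
```

Picking `start`, `end`, and `new_len` at random per utterance would give the kind of section-wise speed perturbation the method describes.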
10. Xie, Kai, Yuxuan Gao, Yadang Chen, and Xun Che. "Mask Mixup Model: Enhanced Contrastive Learning for Few-Shot Learning." Applied Sciences 14, no. 14 (July 11, 2024): 6063. http://dx.doi.org/10.3390/app14146063.

Abstract:
Few-shot image classification aims to improve the performance of traditional image classification when faced with limited data. Its main challenge lies in effectively utilizing sparse sample label data to accurately predict the true feature distribution. Recent approaches have employed data augmentation techniques like random Mask or mixture interpolation to enhance the diversity and generalization of labeled samples. However, these methods still encounter several issues: (1) random Mask can lead to complete blockage or exposure of foreground, causing loss of crucial sample information; and (2) uniform data distribution after mixture interpolation makes it difficult for the model to differentiate between different categories and effectively distinguish their boundaries. To address these challenges, this paper introduces a novel data augmentation method based on saliency mask blending. Firstly, it selectively preserves key image features through adaptive selection and retention using visual feature occlusion fusion and confidence clipping strategies. Secondly, a visual feature saliency fusion approach is employed to calculate the importance of various image regions, guiding the blending process to produce more diverse and enriched images with clearer category boundaries. The proposed method achieves outstanding performance on multiple standard few-shot image classification datasets (miniImageNet, tieredImageNet, Few-shot FC100, and CUB), surpassing state-of-the-art methods by approximately 0.2–1%.

Dissertations / Theses on the topic "Interpolation-Based data augmentation"

1. Venkataramanan, Shashanka. "Metric learning for instance and category-level visual representation." Electronic Thesis or Diss., Université de Rennes (2023-....), 2024. http://www.theses.fr/2024URENS022.

Abstract:
The primary goal in computer vision is to enable machines to extract meaningful information from visual data, such as images and videos, and leverage this information to perform a wide range of tasks. To this end, substantial research has focused on developing deep learning models capable of encoding comprehensive and robust visual representations. A prominent strategy in this context involves pretraining models on large-scale datasets, such as ImageNet, to learn representations that can exhibit cross-task applicability and facilitate the successful handling of diverse downstream tasks with minimal effort. To facilitate learning on these large-scale datasets and encode good representations, complex data augmentation strategies have been used. However, these augmentations can be limited in their scope, either being hand-crafted and lacking diversity, or generating images that appear unnatural. Moreover, the focus of these augmentation techniques has primarily been on the ImageNet dataset and its downstream tasks, limiting their applicability to a broader range of computer vision problems. In this thesis, we aim to tackle these limitations by exploring different approaches to enhance the efficiency and effectiveness of representation learning. The common thread across the works presented is the use of interpolation-based techniques, such as mixup, to generate diverse and informative training examples beyond the original dataset. In the first work, we are motivated by the idea of deformation as a natural way of interpolating images rather than using a convex combination. We show that geometrically aligning the two images in the feature space allows for more natural interpolation that retains the geometry of one image and the texture of the other, connecting it to style transfer. Drawing from these observations, we explore the combination of mixup and deep metric learning. We develop a generalized formulation that accommodates mixup in metric learning, leading to improved representations that explore areas of the embedding space beyond the training classes. Building on these insights, we revisit the original motivation of mixup and generate a larger number of interpolated examples beyond the mini-batch size by interpolating in the embedding space. This approach allows us to sample on the entire convex hull of the mini-batch, rather than just along linear segments between pairs of examples. Finally, we investigate the potential of using natural augmentations of objects from videos. We introduce a "Walking Tours" dataset of first-person egocentric videos, which capture a diverse range of objects and actions in natural scene transitions. We then propose a novel self-supervised pretraining method called DoRA, which detects and tracks objects in video frames, deriving multiple views from the tracks and using them in a self-supervised manner.

Book chapters on the topic "Interpolation-Based data augmentation"

1. Rabah, Mohamed Louay, Nedra Mellouli, and Imed Riadh Farah. "Interpolation and Prediction of Piezometric Multivariate Time Series Based on Data Augmentation and Transformers." In Lecture Notes in Networks and Systems, 327–44. Cham: Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-47724-9_22.


Conference papers on the topic "Interpolation-Based data augmentation"

1. Ye, Mao, Haitao Wang, and Zheqian Chen. "MSMix: An Interpolation-Based Text Data Augmentation Method Manifold Swap Mixup." In 4th International Conference on Natural Language Processing and Machine Learning. Academy and Industry Research Collaboration Center (AIRCC), 2023. http://dx.doi.org/10.5121/csit.2023.130806.

Abstract:
To solve the problem of poor performance of deep neural network models due to insufficient data, a simple yet effective interpolation-based data augmentation method is proposed: MSMix (Manifold Swap Mixup). This method feeds two different samples to the same deep neural network model, then randomly selects a specific layer and partially replaces the hidden features of one sample at that layer with the counterpart of the other. The mixed hidden features are fed back into the model and pass through the rest of the network. Two different selection strategies are also proposed to obtain richer hidden representations. Experiments are conducted on three Chinese intention recognition datasets, and the results show that the MSMix method achieves better results than other methods in both full-sample and small-sample configurations.
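The core operation — partially replacing one sample's hidden features at a chosen layer with the other sample's — can be sketched outside any particular network. This is a NumPy illustration only; the function name and the `swap_ratio` knob are our assumptions, not the paper's API:

```python
import numpy as np

def msmix_swap(h_a, h_b, swap_ratio=0.5, rng=None):
    """MSMix-style mixing sketch: overwrite a random subset of sample A's
    hidden dimensions with sample B's values at the same layer."""
    rng = rng or np.random.default_rng()
    d = h_a.shape[-1]
    # Uniformly sample which hidden dimensions to swap.
    idx = rng.choice(d, size=int(d * swap_ratio), replace=False)
    mixed = h_a.copy()
    mixed[..., idx] = h_b[..., idx]
    return mixed, idx
```

In the actual method the mixed hidden state then continues through the remaining layers of the network, and the paper's two selection strategies would replace the uniform `rng.choice` used here.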
2. Heo, Jaeseung, Seungbeom Lee, Sungsoo Ahn, and Dongwoo Kim. "EPIC: Graph Augmentation with Edit Path Interpolation via Learnable Cost." In Thirty-Third International Joint Conference on Artificial Intelligence {IJCAI-24}. California: International Joint Conferences on Artificial Intelligence Organization, 2024. http://dx.doi.org/10.24963/ijcai.2024/455.

Abstract:
Data augmentation plays a critical role in improving model performance across various domains, but it becomes challenging with graph data due to their complex and irregular structure. To address this issue, we propose EPIC (Edit Path Interpolation via learnable Cost), a novel interpolation-based method for augmenting graph datasets. To interpolate between two graphs lying in an irregular domain, EPIC leverages the concept of graph edit distance, constructing an edit path that represents the transformation process between two graphs via edit operations. Moreover, our method introduces a context-sensitive cost model that accounts for the importance of specific edit operations formulated through a learning framework. This allows for a more nuanced transformation process, where the edit distance is not merely count-based but reflects meaningful graph attributes. With randomly sampled graphs from the edit path, we enrich the training set to enhance the generalization capability of classification models. Experimental evaluations across several benchmark datasets demonstrate that our approach outperforms existing augmentation techniques in many tasks.
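To make the edit-path idea concrete, here is a toy sketch over plain edge sets (our illustration under strong simplifications: unit-cost, edge-only edits in a fixed order, whereas EPIC learns a context-sensitive cost over edits on attributed graphs):

```python
def edit_path_sample(edges_a, edges_b, k):
    """Toy edit-path interpolation between two graphs given as edge sets.
    The path first deletes the edges unique to A, then inserts the edges
    unique to B; return the intermediate graph after the first k edits."""
    deletions = sorted(edges_a - edges_b)
    insertions = sorted(edges_b - edges_a)
    g = set(edges_a)
    for edge in (deletions + insertions)[:k]:
        if edge in g:
            g.remove(edge)   # deletion step
        else:
            g.add(edge)      # insertion step
    return g
```

Sampling `k` uniformly between 0 and the total number of edits yields intermediate graphs "between" the pair, which is the kind of augmented training example the method enriches the training set with.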
3. Li, Chen, Xutan Peng, Hao Peng, Jianxin Li, and Lihong Wang. "TextGTL: Graph-based Transductive Learning for Semi-supervised Text Classification via Structure-Sensitive Interpolation." In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/369.

Abstract:
Compared with traditional sequential learning models, graph-based neural networks exhibit excellent properties when encoding text, such as the capacity to capture global and local information simultaneously. Especially in the semi-supervised scenario, propagating information along the edges can effectively alleviate the sparsity of labeled data. In this paper, beyond the existing architecture of heterogeneous word-document graphs, for the first time, we investigate how to construct lightweight non-heterogeneous graphs based on different linguistic information to better serve free text representation learning. Then, a novel semi-supervised framework for text classification that refines graph topology under theoretical guidance and shares information across different text graphs, namely Text-oriented Graph-based Transductive Learning (TextGTL), is proposed. TextGTL also performs attribute space interpolation based on dense substructures in graphs to predict low-entropy labels with high-quality feature nodes for data augmentation. To verify the effectiveness of TextGTL, we conduct extensive experiments on various benchmark datasets, observing significant performance gains over conventional heterogeneous graphs. In addition, we also design ablation studies to dive deep into the validity of the components in TextGTL.