Journal articles on the topic "Dataset shift"

Create an accurate citation in APA, MLA, Chicago, Harvard, and other styles

Consult the top 50 journal articles for your research on the topic "Dataset shift".

Next to each source in the list of references there is an "Add to bibliography" button. Press this button, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Vancouver, Chicago, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Explore journal articles from a wide variety of disciplines and organize your bibliography correctly.

1

Sharet, Nir, and Ilan Shimshoni. "Analyzing Data Changes using Mean Shift Clustering." International Journal of Pattern Recognition and Artificial Intelligence 30, no. 07 (May 25, 2016): 1650016. http://dx.doi.org/10.1142/s0218001416500166.

Abstract
A nonparametric unsupervised method for analyzing changes in complex datasets is proposed. It is based on the mean shift clustering algorithm. Mean shift is used to cluster the old and new datasets and compare the results in a nonparametric manner. Each point from the new dataset naturally belongs to a cluster of points from its dataset. The method is also able to find to which cluster the point belongs in the old dataset and use this information to report qualitative differences between that dataset and the new one. Changes in local cluster distribution are also reported. The report can then be used to try to understand the underlying reasons which caused the changes in the distributions. On the basis of this method, a transductive transfer learning method for automatically labeling data from the new dataset is also proposed. This labeled data is used, in addition to the old training set, to train a classifier better suited to the new dataset. The algorithm has been implemented and tested on simulated and real (a stereo image pair) datasets. Its performance was also compared with several state-of-the-art methods.
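As a rough illustration of the idea described above (clustering an old and a new dataset with mean shift and checking where points of the new dataset fall within the old clustering), a minimal sketch using scikit-learn could look as follows. The synthetic data, bandwidth and comparison logic are assumptions made for illustration, not the authors' implementation.

# Minimal sketch: cluster an "old" and a "new" dataset with mean shift and
# report, for each new-data cluster, how its points spread over the old clusters.
import numpy as np
from sklearn.cluster import MeanShift
from sklearn.datasets import make_blobs

old_X, _ = make_blobs(n_samples=300, centers=[[0, 0], [5, 5]], random_state=0)
new_X, _ = make_blobs(n_samples=300, centers=[[0, 1], [6, 5]], random_state=1)

ms_old = MeanShift(bandwidth=2.0).fit(old_X)
ms_new = MeanShift(bandwidth=2.0).fit(new_X)

# Assign each new point to its nearest old cluster and compare distributions.
new_in_old = ms_old.predict(new_X)
for new_label in np.unique(ms_new.labels_):
    mask = ms_new.labels_ == new_label
    counts = np.bincount(new_in_old[mask], minlength=len(ms_old.cluster_centers_))
    print(f"new cluster {new_label}: share per old cluster {counts / mask.sum()}")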
2

Adams, Niall. "Dataset Shift in Machine Learning." Journal of the Royal Statistical Society: Series A (Statistics in Society) 173, no. 1 (January 2010): 274. http://dx.doi.org/10.1111/j.1467-985x.2009.00624_10.x.

3

Guo, Lin Lawrence, Stephen R. Pfohl, Jason Fries, Jose Posada, Scott Lanyon Fleming, Catherine Aftandilian, Nigam Shah, and Lillian Sung. "Systematic Review of Approaches to Preserve Machine Learning Performance in the Presence of Temporal Dataset Shift in Clinical Medicine." Applied Clinical Informatics 12, no. 04 (August 2021): 808–15. http://dx.doi.org/10.1055/s-0041-1735184.

Abstract
Abstract Objective The change in performance of machine learning models over time as a result of temporal dataset shift is a barrier to machine learning-derived models facilitating decision-making in clinical practice. Our aim was to describe technical procedures used to preserve the performance of machine learning models in the presence of temporal dataset shifts. Methods Studies were included if they were fully published articles that used machine learning and implemented a procedure to mitigate the effects of temporal dataset shift in a clinical setting. We described how dataset shift was measured, the procedures used to preserve model performance, and their effects. Results Of 4,457 potentially relevant publications identified, 15 were included. The impact of temporal dataset shift was primarily quantified using changes, usually deterioration, in calibration or discrimination. Calibration deterioration was more common (n = 11) than discrimination deterioration (n = 3). Mitigation strategies were categorized as model level or feature level. Model-level approaches (n = 15) were more common than feature-level approaches (n = 2), with the most common approaches being model refitting (n = 12), probability calibration (n = 7), model updating (n = 6), and model selection (n = 6). In general, all mitigation strategies were successful at preserving calibration but not uniformly successful in preserving discrimination. Conclusion There was limited research in preserving the performance of machine learning models in the presence of temporal dataset shift in clinical medicine. Future research could focus on the impact of dataset shift on clinical decision making, benchmark the mitigation strategies on a wider range of datasets and tasks, and identify optimal strategies for specific settings.
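Probability calibration, one of the model-level mitigation strategies counted in this review, can be sketched as keeping the original classifier fixed and refitting only a simple logistic (Platt-style) mapping from its scores to outcomes observed in the more recent period. The sketch below is a generic illustration under that assumption, not a method taken from any of the reviewed studies; data and model names are hypothetical.

# Hypothetical sketch of model-level recalibration on newer data:
# keep the original classifier fixed and refit only a logistic mapping
# from its predicted scores to outcomes observed in the recent period.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_old, y_old = rng.normal(size=(1000, 5)), rng.integers(0, 2, 1000)          # older period
X_new, y_new = rng.normal(loc=0.3, size=(300, 5)), rng.integers(0, 2, 300)   # recent period

base = LogisticRegression().fit(X_old, y_old)          # original model, left unchanged
scores_new = base.predict_proba(X_new)[:, [1]]         # its scores on recent data
recal = LogisticRegression().fit(scores_new, y_new)    # Platt-style recalibration layer

def calibrated_proba(X):
    # Recalibrated probability of the positive class for new observations.
    return recal.predict_proba(base.predict_proba(X)[:, [1]])[:, 1]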
4

He, Zhiqiang. "ECG Heartbeat Classification Under Dataset Shift." Journal of Intelligent Medicine and Healthcare 1, no. 2 (2022): 79–89. http://dx.doi.org/10.32604/jimh.2022.036624.

5

Kim, Doyoung, Inwoong Lee, Dohyung Kim, and Sanghoon Lee. "Action Recognition Using Close-Up of Maximum Activation and ETRI-Activity3D LivingLab Dataset." Sensors 21, no. 20 (October 12, 2021): 6774. http://dx.doi.org/10.3390/s21206774.

Abstract
The development of action recognition models has shown great performance on various video datasets. Nevertheless, because there is no rich data on target actions in existing datasets, it is insufficient to perform action recognition applications required by industries. To satisfy this requirement, datasets composed of target actions with high availability have been created, but it is difficult to capture various characteristics in actual environments because video data are generated in a specific environment. In this paper, we introduce a new ETRI-Activity3D-LivingLab dataset, which provides action sequences in actual environments and helps to handle a network generalization issue due to the dataset shift. When the action recognition model is trained on the ETRI-Activity3D and KIST SynADL datasets and evaluated on the ETRI-Activity3D-LivingLab dataset, the performance can be severely degraded because the datasets were captured in different environments domains. To reduce this dataset shift between training and testing datasets, we propose a close-up of maximum activation, which magnifies the most activated part of a video input in detail. In addition, we present various experimental results and analysis that show the dataset shift and demonstrate the effectiveness of the proposed method.
6

McGaughey, Georgia, W. Patrick Walters, and Brian Goldman. "Understanding covariate shift in model performance." F1000Research 5 (April 7, 2016): 597. http://dx.doi.org/10.12688/f1000research.8317.1.

Abstract
Three (3) different methods (logistic regression, covariate shift and k-NN) were applied to five (5) internal datasets and one (1) external, publicly available dataset where covariate shift existed. In all cases, k-NN’s performance was inferior to either logistic regression or covariate shift. Surprisingly, there was no obvious advantage to using covariate shift to reweight the training data in the examined datasets.
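A common way to implement the covariate-shift reweighting that this abstract refers to is to estimate the density ratio p_test(x)/p_train(x) with a domain classifier and pass the resulting importance weights to a weighted learner. The scikit-learn sketch below illustrates that generic recipe; it is an assumption-based stand-in, not the authors' exact pipeline.

# Sketch of covariate-shift reweighting: estimate importance weights
# p_test(x)/p_train(x) with a domain classifier, then train a weighted
# logistic regression. Illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(500, 3))
y_train = (X_train[:, 0] + rng.normal(size=500) > 0).astype(int)
X_test = rng.normal(0.5, 1.0, size=(500, 3))                 # shifted covariates

# Domain classifier: label 1 = test, 0 = train; weights = p(test|x)/p(train|x).
X_dom = np.vstack([X_train, X_test])
d = np.r_[np.zeros(len(X_train)), np.ones(len(X_test))]
dom = LogisticRegression().fit(X_dom, d)
p = dom.predict_proba(X_train)[:, 1]
w = p / (1.0 - p)

clf = LogisticRegression().fit(X_train, y_train, sample_weight=w)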
7

McGaughey, Georgia, W. Patrick Walters, and Brian Goldman. "Understanding covariate shift in model performance." F1000Research 5 (June 17, 2016): 597. http://dx.doi.org/10.12688/f1000research.8317.2.

Abstract
Three (3) different methods (logistic regression, covariate shift and k-NN) were applied to five (5) internal datasets and one (1) external, publicly available dataset where covariate shift existed. In all cases, k-NN’s performance was inferior to either logistic regression or covariate shift. Surprisingly, there was no obvious advantage to using covariate shift to reweight the training data in the examined datasets.
8

McGaughey, Georgia, W. Patrick Walters, and Brian Goldman. "Understanding covariate shift in model performance." F1000Research 5 (October 17, 2016): 597. http://dx.doi.org/10.12688/f1000research.8317.3.

Abstract
Three (3) different methods (logistic regression, covariate shift and k-NN) were applied to five (5) internal datasets and one (1) external, publicly available dataset where covariate shift existed. In all cases, k-NN’s performance was inferior to either logistic regression or covariate shift. Surprisingly, there was no obvious advantage to using covariate shift to reweight the training data in the examined datasets.
9

Becker, Aneta, and Jarosław Becker. "Dataset shift assessment measures in monitoring predictive models." Procedia Computer Science 192 (2021): 3391–402. http://dx.doi.org/10.1016/j.procs.2021.09.112.

10

Finlayson, Samuel G., Adarsh Subbaswamy, Karandeep Singh, John Bowers, Annabel Kupke, Jonathan Zittrain, Isaac S. Kohane, and Suchi Saria. "The Clinician and Dataset Shift in Artificial Intelligence." New England Journal of Medicine 385, no. 3 (July 15, 2021): 283–86. http://dx.doi.org/10.1056/nejmc2104626.

11

Moreno-Torres, Jose G., Troy Raeder, Rocío Alaiz-Rodríguez, Nitesh V. Chawla, and Francisco Herrera. "A unifying view on dataset shift in classification." Pattern Recognition 45, no. 1 (January 2012): 521–30. http://dx.doi.org/10.1016/j.patcog.2011.06.019.

12

Subbaswamy, Adarsh, Bryant Chen, and Suchi Saria. "A unifying causal framework for analyzing dataset shift-stable learning algorithms." Journal of Causal Inference 10, no. 1 (January 1, 2022): 64–89. http://dx.doi.org/10.1515/jci-2021-0042.

Abstract
Recent interest in the external validity of prediction models (i.e., the problem of different train and test distributions, known as dataset shift) has produced many methods for finding predictive distributions that are invariant to dataset shifts and can be used for prediction in new, unseen environments. However, these methods consider different types of shifts and have been developed under disparate frameworks, making it difficult to theoretically analyze how solutions differ with respect to stability and accuracy. Taking a causal graphical view, we use a flexible graphical representation to express various types of dataset shifts. Given a known graph of the data generating process, we show that all invariant distributions correspond to a causal hierarchy of graphical operators, which disable the edges in the graph that are responsible for the shifts. The hierarchy provides a common theoretical underpinning for understanding when and how stability to shifts can be achieved, and in what ways stable distributions can differ. We use it to establish conditions for minimax optimal performance across environments, and derive new algorithms that find optimal stable distributions. By using this new perspective, we empirically demonstrate that there is a tradeoff between minimax and average performance.
13

Xie, Y., K. Schindler, J. Tian, and X. X. Zhu. "EXPLORING CROSS-CITY SEMANTIC SEGMENTATION OF ALS POINT CLOUDS." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B2-2021 (June 28, 2021): 247–54. http://dx.doi.org/10.5194/isprs-archives-xliii-b2-2021-247-2021.

Abstract
Abstract. Deep learning models achieve excellent semantic segmentation results for airborne laser scanning (ALS) point clouds, if sufficient training data are provided. Increasing amounts of annotated data are becoming publicly available thanks to contributors from all over the world. However, models trained on a specific dataset typically exhibit poor performance on other datasets. I.e., there are significant domain shifts, as data captured in different environments or by distinct sensors have different distributions. In this work, we study this domain shift and potential strategies to mitigate it, using two popular ALS datasets: the ISPRS Vaihingen benchmark from Germany and the LASDU benchmark from China. We compare different training strategies for cross-city ALS point cloud semantic segmentation. In our experiments, we analyse three factors that may lead to domain shift and affect the learning: point cloud density, LiDAR intensity, and the role of data augmentation. Moreover, we evaluate a well-known standard method of domain adaptation, deep CORAL (Sun and Saenko, 2016). In our experiments, adapting the point cloud density and appropriate data augmentation both help to reduce the domain gap and improve segmentation accuracy. On the contrary, intensity features can bring an improvement within a dataset, but deteriorate the generalisation across datasets. Deep CORAL does not further improve the accuracy over the simple adaptation of density and data augmentation, although it can mitigate the impact of improperly chosen point density, intensity features, and further dataset biases like lack of diversity.
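Deep CORAL, the domain-adaptation baseline evaluated in this work, aligns the second-order statistics of source and target features. A minimal NumPy version of the CORAL distance (the term that would be added to the task loss) is sketched below; the random feature matrices are placeholders, not ALS point-cloud features.

# Minimal CORAL distance between source and target feature matrices
# (rows = samples, columns = feature dimensions). In deep CORAL this
# quantity is added to the task loss to align the two domains.
import numpy as np

def coral_distance(F_s, F_t):
    d = F_s.shape[1]
    C_s = np.cov(F_s, rowvar=False)
    C_t = np.cov(F_t, rowvar=False)
    return np.sum((C_s - C_t) ** 2) / (4.0 * d * d)

rng = np.random.default_rng(0)
print(coral_distance(rng.normal(size=(256, 64)), rng.normal(0.2, 1.1, size=(128, 64))))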
14

ZHAO, YUZHONG, BABAK ALIPANAHI, SHUAI CHENG LI, and MING LI. "PROTEIN SECONDARY STRUCTURE PREDICTION USING NMR CHEMICAL SHIFT DATA." Journal of Bioinformatics and Computational Biology 08, no. 05 (October 2010): 867–84. http://dx.doi.org/10.1142/s0219720010004987.

Abstract
Accurate determination of protein secondary structure from the chemical shift information is a key step for NMR tertiary structure determination. Relatively little work has been done on this subject. There needs to be a systematic investigation of algorithms that are (a) robust for large datasets; (b) easily extendable to (the dynamic) new databases; and (c) approaching the limit of accuracy. We introduce new approaches that use the k-nearest neighbor algorithm to do the basic prediction and use the BCJR algorithm to smooth the predictions and combine different predictions from chemical shifts and those based on sequence information only. Our new system, SUCCES, improves the accuracy of all existing methods on a large dataset of 805 proteins (at 86% Q3 accuracy and at 92.6% accuracy when the boundary residues are ignored), and it is easily extendable to any new dataset without requiring any new training. The software is publicly available.
15

Chakraborty, Saptarshi, Debolina Paul, and Swagatam Das. "Automated Clustering of High-dimensional Data with a Feature Weighted Mean Shift Algorithm." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (May 18, 2021): 6930–38. http://dx.doi.org/10.1609/aaai.v35i8.16854.

Abstract
Mean shift is a simple iterative procedure that gradually shifts data points towards the mode, which denotes the highest density of data points in the region. Mean shift algorithms have been effectively used for data denoising, mode seeking, and finding the number of clusters in a dataset in an automated fashion. However, the merits of mean shift quickly fade away as the data dimensions increase and only a handful of features contain useful information about the cluster structure of the data. We propose a simple yet elegant feature-weighted variant of mean shift to efficiently learn the feature importance, thus extending the merits of mean shift to high-dimensional data. The resulting algorithm not only outperforms the conventional mean shift clustering procedure but also preserves its computational simplicity. In addition, the proposed method comes with rigorous theoretical convergence guarantees and a convergence rate of at least a cubic order. The efficacy of our proposal is thoroughly assessed through experimental comparison against baseline and state-of-the-art clustering methods on synthetic as well as real-world datasets.
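One plausible way to picture a feature-weighted mean shift update is to place a per-feature weight inside the Gaussian kernel, so that uninformative dimensions contribute little to the shift. The toy NumPy sketch below uses fixed weights purely for illustration; the paper's contribution is to learn these weights, which is not shown here.

# Toy mean shift update with fixed per-feature weights in the Gaussian kernel.
import numpy as np

def weighted_mean_shift(X, feature_weights, bandwidth=1.0, n_iter=50):
    # Each feature dimension is scaled by a fixed weight; low-weight
    # (uninformative) features barely affect the shift direction.
    Y = X.copy()
    for _ in range(n_iter):
        diff = Y[:, None, :] - X[None, :, :]
        d2 = np.einsum('ijk,k->ij', diff ** 2, feature_weights)
        K = np.exp(-d2 / (2.0 * bandwidth ** 2))
        Y = (K @ X) / K.sum(axis=1, keepdims=True)
    return Y   # shifted points pile up near the modes of the weighted density

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
modes = weighted_mean_shift(X, feature_weights=np.array([1.0, 0.1]))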
16

Tasche, Dirk. "Factorizable Joint Shift in Multinomial Classification." Machine Learning and Knowledge Extraction 4, no. 3 (September 10, 2022): 779–802. http://dx.doi.org/10.3390/make4030038.

Abstract
Factorizable joint shift (FJS) was recently proposed as a type of dataset shift for which the complete characteristics can be estimated from feature data observations on the test dataset by a method called Joint Importance Aligning. For the multinomial (multiclass) classification setting, we derive a representation of factorizable joint shift in terms of the source (training) distribution, the target (test) prior class probabilities and the target marginal distribution of the features. On the basis of this result, we propose alternatives to joint importance aligning and, at the same time, point out that factorizable joint shift is not fully identifiable if no class label information on the test dataset is available and no additional assumptions are made. Other results of the paper include correction formulae for the posterior class probabilities both under general dataset shift and factorizable joint shift. In addition, we investigate the consequences of assuming factorizable joint shift for the bias caused by sample selection.
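To give the flavour of such correction formulae, the sketch below applies the standard prior-shift adjustment of posterior class probabilities (reweight source posteriors by the ratio of target to source priors, then renormalise). This is the textbook correction under label shift, shown only as an illustration; it is not the paper's FJS-specific result.

# Standard prior-shift correction of posterior class probabilities.
import numpy as np

def correct_posteriors(p_src, prior_src, prior_tgt):
    # p_src: (n_samples, n_classes) posteriors from the source-trained model.
    ratio = prior_tgt / prior_src
    unnorm = p_src * ratio
    return unnorm / unnorm.sum(axis=1, keepdims=True)

p_src = np.array([[0.7, 0.2, 0.1], [0.3, 0.4, 0.3]])
print(correct_posteriors(p_src, np.array([0.5, 0.3, 0.2]), np.array([0.2, 0.3, 0.5])))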
17

Xue, Zhiyun, Feng Yang, Sivaramakrishnan Rajaraman, Ghada Zamzmi, and Sameer Antani. "Cross Dataset Analysis of Domain Shift in CXR Lung Region Detection." Diagnostics 13, no. 6 (March 11, 2023): 1068. http://dx.doi.org/10.3390/diagnostics13061068.

Abstract
Domain shift is one of the key challenges affecting reliability in medical imaging-based machine learning predictions. It is of significant importance to investigate this issue to gain insights into its characteristics toward determining controllable parameters to minimize its impact. In this paper, we report our efforts on studying and analyzing domain shift in lung region detection in chest radiographs. We used five chest X-ray datasets, collected from different sources, which have manual markings of lung boundaries in order to conduct extensive experiments toward this goal. We compared the characteristics of these datasets from three aspects: information obtained from metadata or an image header, image appearance, and features extracted from a pretrained model. We carried out experiments to evaluate and compare model performances within each dataset and across datasets in four scenarios using different combinations of datasets. We proposed a new feature visualization method to provide explanations for the applied object detection network on the obtained quantitative results. We also examined chest X-ray modality-specific initialization, catastrophic forgetting, and model repeatability. We believe the observations and discussions presented in this work could help to shed some light on the importance of the analysis of training data for medical imaging machine learning research, and could provide valuable guidance for domain shift analysis.
18

Sáez, José A., and José L. Romero-Béjar. "Impact of Regressand Stratification in Dataset Shift Caused by Cross-Validation." Mathematics 10, no. 14 (July 21, 2022): 2538. http://dx.doi.org/10.3390/math10142538.

Abstract
Data that have not been modeled cannot be correctly predicted. Under this assumption, this research studies how k-fold cross-validation can introduce dataset shift in regression problems. Such a shift means that the data distributions in the training and test sets differ and, therefore, that the estimation of model performance deteriorates. Even though the stratification of the output variable is widely used in the field of classification to reduce the impact of dataset shift induced by cross-validation, its use in regression is not widespread in the literature. This paper analyzes the consequences for dataset shift of including different regressand stratification schemes in cross-validation with regression data. The results obtained show that these schemes allow for creating more similar training and test sets, reducing the presence of dataset shift related to cross-validation. The bias and deviation of the performance estimation results obtained by regression algorithms are improved using the highest numbers of strata, as is the number of cross-validation repetitions necessary to obtain these better results.
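A simple way to realise regressand stratification in practice is to bin the continuous target into quantile strata and hand the bin labels to a stratified splitter. The scikit-learn sketch below illustrates that idea; the number of strata and the synthetic data are arbitrary choices, not the paper's experimental setup.

# Regressand stratification for cross-validation: bin the continuous target
# into quantile strata and stratify the folds on the bins, so that train and
# test splits have more similar target distributions.
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X[:, 0] * 2.0 + rng.normal(size=200)

n_strata = 10
strata = np.digitize(y, np.quantile(y, np.linspace(0, 1, n_strata + 1)[1:-1]))

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in skf.split(X, strata):
    pass  # fit and evaluate a regressor on X[train_idx], y[train_idx], etc.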
19

Turhan, Burak. "On the dataset shift problem in software engineering prediction models." Empirical Software Engineering 17, no. 1-2 (October 12, 2011): 62–74. http://dx.doi.org/10.1007/s10664-011-9182-8.

20

Becker, Jarosław, and Aneta Becker. "Predictive Accuracy Index in evaluating the dataset shift (case study)." Procedia Computer Science 225 (2023): 3342–51. http://dx.doi.org/10.1016/j.procs.2023.10.328.

21

Aryal, Jagannath, and Bipul Neupane. "Multi-Scale Feature Map Aggregation and Supervised Domain Adaptation of Fully Convolutional Networks for Urban Building Footprint Extraction." Remote Sensing 15, no. 2 (January 13, 2023): 488. http://dx.doi.org/10.3390/rs15020488.

Abstract
Automated building footprint extraction requires the Deep Learning (DL)-based semantic segmentation of high-resolution Earth observation images. Fully convolutional networks (FCNs) such as U-Net and ResUNET are widely used for such segmentation. The evolving FCNs suffer from the inadequate use of multi-scale feature maps in their backbone of convolutional neural networks (CNNs). Furthermore, the DL methods are not robust in cross-domain settings due to domain-shift problems. Two scale-robust novel networks, namely MSA-UNET and MSA-ResUNET, are developed in this study by aggregating the multi-scale feature maps in U-Net and ResUNET with partial concepts of the feature pyramid network (FPN). Furthermore, supervised domain adaptation is investigated to minimise the effects of domain-shift between the two datasets. The datasets include the benchmark WHU Building dataset and a developed dataset with 5× fewer samples, 4× lower spatial resolution and complex high-rise buildings and skyscrapers. The newly developed networks are compared to six state-of-the-art FCNs using five metrics: pixel accuracy, adjusted accuracy, F1 score, intersection over union (IoU), and the Matthews Correlation Coefficient (MCC). The proposed networks outperform the FCNs in the majority of the accuracy measures in both datasets. Compared to the larger dataset, the network trained on the smaller one shows significantly higher robustness in terms of adjusted accuracy (by 18%), F1 score (by 31%), IoU (by 27%), and MCC (by 29%) during the cross-domain validation of MSA-UNET. MSA-ResUNET shows similar improvements, concluding that the proposed networks when trained using domain adaptation increase the robustness and minimise the domain-shift between the datasets of different complexity.
22

Peng, Zhiyong, Changlin Han, Yadong Liu, and Zongtan Zhou. "Weighted Policy Constraints for Offline Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 9435–43. http://dx.doi.org/10.1609/aaai.v37i8.26130.

Abstract
Offline reinforcement learning (RL) aims to learn policy from the passively collected offline dataset. Applying existing RL methods on the static dataset straightforwardly will raise distribution shift, causing these unconstrained RL methods to fail. To cope with the distribution shift problem, a common practice in offline RL is to constrain the policy explicitly or implicitly close to behavioral policy. However, the available dataset usually contains sub-optimal or inferior actions, constraining the policy near all these actions will make the policy inevitably learn inferior behaviors, limiting the performance of the algorithm. Based on this observation, we propose a weighted policy constraints (wPC) method that only constrains the learned policy to desirable behaviors, making room for policy improvement on other parts. Our algorithm outperforms existing state-of-the-art offline RL algorithms on the D4RL offline gym datasets. Moreover, the proposed algorithm is simple to implement with few hyper-parameters, making the proposed wPC algorithm a robust offline RL method with low computational complexity.
23

Phongsasiri, Siriwan, and Suwanna Rasmequan. "Outlier Detection in Wellness Data using Probabilistic Mapped Mean-Shift Algorithms." ECTI Transactions on Computer and Information Technology (ECTI-CIT) 15, no. 2 (August 11, 2021): 258–66. http://dx.doi.org/10.37936/ecti-cit.2021152.244971.

Abstract
In this paper, the Probabilistic Mapped Mean-Shift Algorithm is proposed to detect anomalous data in public datasets and local hospital children’s wellness clinic databases. The proposed framework consists of two main parts. First, the Probabilistic Mapping step consists of k-NN instance acquisition, data distribution calculation, and data point reposition. Truncated Gaussian Distribution (TGD) was used for controlling the boundary of the mapped points. Second, the Outlier Detection step consists of outlier score calculation and outlier selection. Experimental results show that the proposed algorithm outperformed the existing algorithms with real-world benchmark datasets and a Children’s Wellness Clinic dataset (CWD). Outlier detection accuracy obtained from the proposed algorithm based on Wellness, Stamps, Arrhythmia, Pima, and Parkinson datasets was 93%, 94%, 80%, 75%, and 72%, respectively.
24

Rodriguez-Vazquez, Javier, Miguel Fernandez-Cortizas, David Perez-Saura, Martin Molina, and Pascual Campoy. "Overcoming Domain Shift in Neural Networks for Accurate Plant Counting in Aerial Images." Remote Sensing 15, no. 6 (March 22, 2023): 1700. http://dx.doi.org/10.3390/rs15061700.

Abstract
This paper presents a novel semi-supervised approach for accurate counting and localization of tropical plants in aerial images that can work in new visual domains in which the available data are not labeled. Our approach uses deep learning and domain adaptation, designed to handle domain shifts between the training and test data, which is a common challenge in such agricultural applications. This method uses a source dataset with annotated plants and a target dataset without annotations, and adapts a model trained on the source dataset to the target dataset using unsupervised domain alignment and pseudolabeling. The experimental results show the effectiveness of this approach for plant counting in aerial images of pineapples under significant domain shift, achieving a reduction of up to 97% in the counting error (1.42 in absolute count) when compared to the supervised baseline (48.6 in absolute count).
25

Tappy, Nicolas, Anna Fontcuberta i Morral, and Christian Monachon. "Image shift correction, noise analysis, and model fitting of (cathodo-)luminescence hyperspectral maps." Review of Scientific Instruments 93, no. 5 (May 1, 2022): 053702. http://dx.doi.org/10.1063/5.0080486.

Abstract
Hyperspectral imaging is an important asset of modern spectroscopy. It allows us to perform optical metrology at a high spatial resolution, for example in cathodoluminescence in scanning electron microscopy. However, hyperspectral datasets present added challenges in their analysis compared to individually taken spectra due to their lower signal to noise ratio and specific aberrations. On the other hand, the large volume of information in a hyperspectral dataset allows the application of advanced statistical analysis methods derived from machine-learning. In this article, we present a methodology to perform model fitting on hyperspectral maps, leveraging principal component analysis to perform a thorough noise analysis of the dataset. We explain how to correct the imaging shift artifact, specific to imaging spectroscopy, by directly evaluating it from the data. The impact of goodness-of-fit-indicators and parameter uncertainties is discussed. We provide indications on how to apply this technique to a variety of hyperspectral datasets acquired using other experimental techniques. As a practical example, we provide an implementation of this analysis using the open-source Python library hyperspy, which is implemented using the well established Jupyter Notebook framework in the scientific community.
26

Wang, Li, Dong Li, Han Liu, JinZhang Peng, Lu Tian, and Yi Shan. "Cross-Dataset Collaborative Learning for Semantic Segmentation in Autonomous Driving." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 2487–94. http://dx.doi.org/10.1609/aaai.v36i3.20149.

Abstract
Semantic segmentation is an important task for scene understanding in self-driving cars and robotics, which aims to assign dense labels for all pixels in the image. Existing work typically improves semantic segmentation performance by exploring different network architectures on a target dataset. Little attention has been paid to building a unified system by simultaneously learning from multiple datasets, due to the inherent distribution shift across different datasets. In this paper, we propose a simple, flexible, and general method for semantic segmentation, termed Cross-Dataset Collaborative Learning (CDCL). Our goal is to train a unified model for improving the performance in each dataset by leveraging information from all the datasets. Specifically, we first introduce a family of Dataset-Aware Blocks (DAB) as the fundamental computing units of the network, which help capture homogeneous convolutional representations and heterogeneous statistics across different datasets. Second, we present a Dataset Alternation Training (DAT) mechanism to facilitate the collaborative optimization procedure. We conduct extensive evaluations on diverse semantic segmentation datasets for autonomous driving. Experiments demonstrate that our method consistently achieves notable improvements over prior single-dataset and cross-dataset training methods without introducing extra FLOPs. Particularly, with the same architecture of PSPNet (ResNet-18), our method outperforms the single-dataset baseline by 5.65%, 6.57%, and 5.79% mIoU on the validation sets of Cityscapes, BDD100K, and CamVid, respectively. We also apply CDCL to point cloud 3D semantic segmentation and achieve improved performance, which further validates the superiority and generality of our method. Code and models will be released.
27

He, Yue, Xinwei Shen, Renzhe Xu, Tong Zhang, Yong Jiang, Wenchao Zou, and Peng Cui. "Covariate-Shift Generalization via Random Sample Weighting." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 10 (June 26, 2023): 11828–36. http://dx.doi.org/10.1609/aaai.v37i10.26396.

Abstract
Shifts in the marginal distribution of covariates from training to the test phase, named covariate-shifts, often lead to unstable prediction performance across agnostic testing data, especially under model misspecification. Recent literature on invariant learning attempts to learn an invariant predictor from heterogeneous environments. However, the performance of the learned predictor depends heavily on the availability and quality of provided environments. In this paper, we propose a simple and effective non-parametric method for generating heterogeneous environments via Random Sample Weighting (RSW). Given the training dataset from a single source environment, we randomly generate a set of covariate-determining sample weights and use each weighted training distribution to simulate an environment. We theoretically show that under appropriate conditions, such random sample weighting can produce sufficient heterogeneity to be exploited by common invariance constraints to find the invariant variables for stable prediction under covariate shifts. Extensive experiments on both simulated and real-world datasets clearly validate the effectiveness of our method.
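The core idea, generating pseudo-environments from a single source by drawing covariate-determined random sample weights, can be caricatured as below. The particular weighting function (a sigmoid of a random projection of the covariates) is an assumption made for illustration and is not the construction analysed in the paper.

# Toy illustration of generating pseudo-environments via random,
# covariate-determined sample weights.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))

def random_environment_weights(X, rng):
    direction = rng.normal(size=X.shape[1])          # random covariate direction
    score = X @ direction
    return 1.0 / (1.0 + np.exp(-score))              # weights depend only on covariates

environments = [random_environment_weights(X, rng) for _ in range(4)]
# Each weight vector defines a reweighted training distribution; an invariance
# constraint can then be applied across these simulated environments.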
28

Hong, Zhiqing, Zelong Li, Shuxin Zhong, Wenjun Lyu, Haotian Wang, Yi Ding, Tian He, and Desheng Zhang. "CrossHAR: Generalizing Cross-dataset Human Activity Recognition via Hierarchical Self-Supervised Pretraining." Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, no. 2 (May 13, 2024): 1–26. http://dx.doi.org/10.1145/3659597.

Abstract
The increasing availability of low-cost wearable devices and smartphones has significantly advanced the field of sensor-based human activity recognition (HAR), attracting considerable research interest. One of the major challenges in HAR is the domain shift problem in cross-dataset activity recognition, which occurs due to variations in users, device types, and sensor placements between the source dataset and the target dataset. Although domain adaptation methods have shown promise, they typically require access to the target dataset during the training process, which might not be practical in some scenarios. To address these issues, we introduce CrossHAR, a new HAR model designed to improve model performance on unseen target datasets. CrossHAR involves three main steps: (i) CrossHAR explores the sensor data generation principle to diversify the data distribution and augment the raw sensor data. (ii) CrossHAR then employs a hierarchical self-supervised pretraining approach with the augmented data to develop a generalizable representation. (iii) Finally, CrossHAR fine-tunes the pretrained model with a small set of labeled data in the source dataset, enhancing its performance in cross-dataset HAR. Our extensive experiments across multiple real-world HAR datasets demonstrate that CrossHAR outperforms current state-of-the-art methods by 10.83% in accuracy, demonstrating its effectiveness in generalizing to unseen target datasets.
29

Wei, Weiwei, Yuxuan Liao, Yufei Wang, Shaoqi Wang, Wen Du, Hongmei Lu, Bo Kong, Huawu Yang, and Zhimin Zhang. "Deep Learning-Based Method for Compound Identification in NMR Spectra of Mixtures." Molecules 27, no. 12 (June 7, 2022): 3653. http://dx.doi.org/10.3390/molecules27123653.

Abstract
Nuclear magnetic resonance (NMR) spectroscopy is highly unbiased and reproducible, which provides us a powerful tool to analyze mixtures consisting of small molecules. However, the compound identification in NMR spectra of mixtures is highly challenging because of chemical shift variations of the same compound in different mixtures and peak overlapping among molecules. Here, we present a pseudo-Siamese convolutional neural network method (pSCNN) to identify compounds in mixtures for NMR spectroscopy. A data augmentation method was implemented for the superposition of several NMR spectra sampled from a spectral database with random noises. The augmented dataset was split and used to train, validate and test the pSCNN model. Two experimental NMR datasets (flavor mixtures and additional flavor mixture) were acquired to benchmark its performance in real applications. The results show that the proposed method can achieve good performances in the augmented test set (ACC = 99.80%, TPR = 99.70% and FPR = 0.10%), the flavor mixtures dataset (ACC = 97.62%, TPR = 96.44% and FPR = 2.29%) and the additional flavor mixture dataset (ACC = 91.67%, TPR = 100.00% and FPR = 10.53%). We have demonstrated that the translational invariance of convolutional neural networks can solve the chemical shift variation problem in NMR spectra. In summary, pSCNN is an off-the-shelf method to identify compounds in mixtures for NMR spectroscopy because of its accuracy in compound identification and robustness to chemical shift variation.
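The augmentation step described above (superposing several database spectra with random noise and pairing the synthetic mixture with a pure-compound spectrum plus a present/absent label) could be sketched as follows; the array sizes and sampling choices are illustrative assumptions, not the authors' settings.

# Sketch of the augmentation idea: build a synthetic "mixture" spectrum by
# superposing a few randomly chosen database spectra with random noise, and
# pair it with one pure-compound spectrum plus a present/absent label.
import numpy as np

rng = np.random.default_rng(0)
library = rng.random((50, 4096))          # stand-in for a database of 1D NMR spectra

def make_pair(library, rng, n_components=3, noise_level=0.01):
    idx = rng.choice(len(library), size=n_components, replace=False)
    mixture = library[idx].sum(axis=0) + rng.normal(0, noise_level, library.shape[1])
    query_idx = rng.integers(len(library))
    label = int(query_idx in idx)          # is the query compound in the mixture?
    return mixture, library[query_idx], label

mixture, query, label = make_pair(library, rng)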
30

Blanza, J., X. E. Cabasal, J. B. Cipriano, G. A. Guerrero, R. Y. Pescador, and E. V. Rivera. "Indoor Wireless Multipaths Outlier Detection and Clustering." Journal of Physics: Conference Series 2356, no. 1 (October 1, 2022): 012037. http://dx.doi.org/10.1088/1742-6596/2356/1/012037.

Abstract
Wireless communication systems have grown and developed significantly in recent years to fulfill the growing demand for high data rates across a wireless medium. Channel models have been used to develop various sturdy wireless systems for indoor and outdoor applications, and these are simulated in the form of datasets. The presence of outliers in clusters has been a concern in datasets, as it affects the standard deviation and mean of the dataset which reduces the data accuracy. In this study, the outliers in the Cooperation in Science and Technology (COST) 2100 MIMO channel model dataset were shifted to the means of the clusters using the Mean Shift Outlier Detection method. Afterward, the data is clustered using simultaneous clustering and model selection matrix affinity (SCAMSMA). The Mean Shift Outlier Detection method identified 52 and 46 multipaths as outliers and improved the clustering accuracy of the indoor scenarios by 3.5% and 0.93%, respectively. It also increased the precision of the clustering based on the decrease in standard deviation of the Jaccard indices from 0.2435 to 0.1807 and 0.3038 to 0.2075.
31

Goel, Parth, and Amit Ganatra. "Unsupervised Domain Adaptation for Image Classification and Object Detection Using Guided Transfer Learning Approach and JS Divergence." Sensors 23, no. 9 (April 30, 2023): 4436. http://dx.doi.org/10.3390/s23094436.

Abstract
Unsupervised domain adaptation (UDA) is a transfer learning technique utilized in deep learning. UDA aims to reduce the distribution gap between labeled source and unlabeled target domains by adapting a model through fine-tuning. Typically, UDA approaches assume the same categories in both domains. The effectiveness of transfer learning depends on the degree of similarity between the domains, which determines an efficient fine-tuning strategy. Furthermore, domain-specific tasks generally perform well when the feature distributions of the domains are similar. However, utilizing a trained source model directly in the target domain may not generalize effectively due to domain shift. Domain shift can be caused by intra-class variations, camera sensor variations, background variations, and geographical changes. To address these issues, we design an efficient unsupervised domain adaptation network for image classification and object detection that can learn transferable feature representations and reduce the domain shift problem in a unified network. We propose the guided transfer learning approach to select the layers for fine-tuning the model, which enhances feature transferability and utilizes the JS-Divergence to minimize the domain discrepancy between the domains. We evaluate our proposed approaches using multiple benchmark datasets. Our domain adaptive image classification approach achieves 93.2% accuracy on the Office-31 dataset and 75.3% accuracy on the Office-Home dataset. In addition, our domain adaptive object detection approach achieves 51.1% mAP on the Foggy Cityscapes dataset and 72.7% mAP on the Indian Vehicle dataset. We conduct extensive experiments and ablation studies to demonstrate the effectiveness and efficiency of our work. Experimental results also show that our work significantly outperforms the existing methods.
32

Kushol, Rafsanjany, Alan H. Wilman, Sanjay Kalra, and Yee-Hong Yang. "DSMRI: Domain Shift Analyzer for Multi-Center MRI Datasets." Diagnostics 13, no. 18 (September 14, 2023): 2947. http://dx.doi.org/10.3390/diagnostics13182947.

Abstract
In medical research and clinical applications, the utilization of MRI datasets from multiple centers has become increasingly prevalent. However, inherent variability between these centers presents challenges due to domain shift, which can impact the quality and reliability of the analysis. Regrettably, the absence of adequate tools for domain shift analysis hinders the development and validation of domain adaptation and harmonization techniques. To address this issue, this paper presents a novel Domain Shift analyzer for MRI (DSMRI) framework designed explicitly for domain shift analysis in multi-center MRI datasets. The proposed model assesses the degree of domain shift within an MRI dataset by leveraging various MRI-quality-related metrics derived from the spatial domain. DSMRI also incorporates features from the frequency domain to capture low- and high-frequency information about the image. It further includes the wavelet domain features by effectively measuring the sparsity and energy present in the wavelet coefficients. Furthermore, DSMRI introduces several texture features, thereby enhancing the robustness of the domain shift analysis process. The proposed framework includes visualization techniques such as t-SNE and UMAP to demonstrate that similar data are grouped closely while dissimilar data are in separate clusters. Additionally, quantitative analysis is used to measure the domain shift distance, domain classification accuracy, and the ranking of significant features. The effectiveness of the proposed approach is demonstrated using experimental evaluations on seven large-scale multi-site neuroimaging datasets.
33

Sinha, Samarth, Homanga Bharadhwaj, Anirudh Goyal, Hugo Larochelle, Animesh Garg, and Florian Shkurti. "DIBS: Diversity Inducing Information Bottleneck in Model Ensembles." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 11 (May 18, 2021): 9666–74. http://dx.doi.org/10.1609/aaai.v35i11.17163.

Abstract
Although deep learning models have achieved state-of-the-art performance on a number of vision tasks, generalization over high-dimensional multi-modal data and reliable predictive uncertainty estimation are still active areas of research. Bayesian approaches, including Bayesian Neural Nets (BNNs), do not scale well to modern computer vision tasks, as they are difficult to train and have poor generalization under dataset shift. This motivates the need for effective ensembles which can generalize and give reliable uncertainty estimates. In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity-inducing adversarial loss for learning the stochastic latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data. We evaluate our method on benchmark datasets (MNIST, CIFAR100, TinyImageNet and MIT Places 2) and, compared to the most competitive baselines, show significant improvements under a shift in the data distribution and in out-of-distribution detection: over 10% relative improvement in classification accuracy, over 5% relative improvement in generalizing under dataset shift, and over 5% better predictive uncertainty estimation as inferred by efficient out-of-distribution (OOD) detection.
34

Heffington, Colton, Brandon Beomseob Park, and Laron K. Williams. "The "Most Important Problem" Dataset (MIPD): a new dataset on American issue importance." Conflict Management and Peace Science 36, no. 3 (March 31, 2017): 312–35. http://dx.doi.org/10.1177/0738894217691463.

Abstract
This article introduces the Most Important Problem Dataset (MIPD). The MIPD provides individual-level responses by Americans to “most important problem” questions from 1939 to 2015 coded into 58 different problem categories. The MIPD also contains individual-level information on demographics, economic evaluations, partisan preferences, approval and party competencies. This dataset can help answer questions about how the public prioritizes all problems, domestic and foreign, and we demonstrate how these data can shed light on how circumstances influence foreign policy attentiveness. Our exploratory analysis of foreign policy issue attention reveals some notable patterns about foreign policy public opinion. First, foreign policy issues rarely eclipse economic issues on the public’s problem agenda, so efforts to shift attention from poor economic performance to foreign policy via diversionary maneuvers are unlikely to be successful in the long term. Second, we find no evidence that partisan preferences—whether characterized as partisan identification or ideology—motivate partisans to prioritize different problems owing to perceptions of issue ownership. Instead, Republicans and Democrats, conservatives and liberals, respond in similar fashions to shifting domestic and international conditions.
35

Guo, Fumin, Matthew Ng, Maged Goubran, Steffen E. Petersen, Stefan K. Piechnik, Stefan Neubauer, and Graham Wright. "Improving cardiac MRI convolutional neural network segmentation on small training datasets and dataset shift: A continuous kernel cut approach." Medical Image Analysis 61 (April 2020): 101636. http://dx.doi.org/10.1016/j.media.2020.101636.

36

Vescovi, R. F. C., M. B. Cardoso, and E. X. Miqueles. "Radiography registration for mosaic tomography." Journal of Synchrotron Radiation 24, no. 3 (April 7, 2017): 686–94. http://dx.doi.org/10.1107/s1600577517001953.

Abstract
A hybrid method of stitching X-ray computed tomography (CT) datasets is proposed and the feasibility to apply the scheme in a synchrotron tomography beamline with micrometre resolution is shown. The proposed method enables the field of view of the system to be extended while spatial resolution and experimental setup remain unchanged. The approach relies on taking full tomographic datasets at different positions in a mosaic array and registering the frames using Fourier phase correlation and a residue-based correlation. To ensure correlation correctness, the limits for the shifts are determined from the experimental motor position readouts. The masked correlation image is then minimized to obtain the correct shift. The partial datasets are blended in the sinogram space to be compatible with common CT reconstructors. The feasibility to use the algorithm to blend the partial datasets in projection space is also shown, creating a new single dataset, and standard reconstruction algorithms are used to restore high-resolution slices even with a small number of projections.
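The Fourier phase-correlation step used to register overlapping frames can be sketched with NumPy's FFT: the peak of the inverse FFT of the normalised cross-power spectrum gives the translation between the two images. The integer-shift toy example below is an illustration only, not the beamline implementation (which also uses a residue-based correlation and motor-readout limits).

# Phase correlation between two overlapping frames: the peak of the inverse
# FFT of the normalised cross-power spectrum gives the translation (integer-
# pixel version, for illustration).
import numpy as np

def phase_correlation_shift(a, b):
    F = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    r = np.fft.ifft2(F / (np.abs(F) + 1e-12)).real
    peak = np.unravel_index(np.argmax(r), r.shape)
    # Map peaks in the upper half of each axis back to negative shifts.
    return tuple(p if p <= s // 2 else p - s for p, s in zip(peak, a.shape))

img = np.random.default_rng(0).random((128, 128))
shifted = np.roll(img, shift=(7, -5), axis=(0, 1))
print(phase_correlation_shift(shifted, img))     # expected (7, -5)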
37

Traynor, Carlos, Tarjinder Sahota, Helen Tomkinson, Ignacio Gonzalez-Garcia, Neil Evans, and Michael Chappell. "Imputing Biomarker Status from RWE Datasets—A Comparative Study." Journal of Personalized Medicine 11, no. 12 (December 13, 2021): 1356. http://dx.doi.org/10.3390/jpm11121356.

Abstract
Missing data is a universal problem in analysing Real-World Evidence (RWE) datasets. In RWE datasets, there is a need to understand which features best correlate with clinical outcomes. In this context, the missing status of several biomarkers may appear as gaps in the dataset that hide meaningful values for analysis. Imputation methods are general strategies that replace missing values with plausible values. Using the Flatiron NSCLC dataset, including more than 35,000 subjects, we compare the imputation performance of six such methods on missing data: predictive mean matching, expectation-maximisation, factorial analysis, random forest, generative adversarial networks and multivariate imputations with tabular networks. We also conduct extensive synthetic data experiments with structural causal models. Statistical learning from incomplete datasets should select an appropriate imputation algorithm accounting for the nature of missingness, the impact of missing data, and the distribution shift induced by the imputation algorithm. For our synthetic data experiments, tabular networks had the best overall performance. Methods using neural networks are promising for complex datasets with non-linearities. However, conventional methods such as predictive mean matching work well for the Flatiron NSCLC biomarker dataset.
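A minimal harness for this kind of comparison masks entries at random, imputes them, and scores the reconstruction error. The scikit-learn imputers below are generic stand-ins; the paper benchmarks a different and wider set of methods (predictive mean matching, EM, random forests, GANs and tabular networks) on real RWE data.

# Minimal harness for comparing imputation methods: mask some entries at
# random, impute them, and measure reconstruction error on the masked cells.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import SimpleImputer, KNNImputer, IterativeImputer

rng = np.random.default_rng(0)
X_full = rng.normal(size=(500, 6))
X_full[:, 1] = 0.8 * X_full[:, 0] + 0.2 * rng.normal(size=500)   # correlated feature

mask = rng.random(X_full.shape) < 0.2
X_miss = X_full.copy()
X_miss[mask] = np.nan

for name, imp in [("mean", SimpleImputer()), ("knn", KNNImputer()),
                  ("iterative", IterativeImputer(random_state=0))]:
    X_hat = imp.fit_transform(X_miss)
    rmse = np.sqrt(np.mean((X_hat[mask] - X_full[mask]) ** 2))
    print(f"{name}: RMSE on masked entries = {rmse:.3f}")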
38

Wang, Xiaoyang, Chen Li, Jianqiao Zhao, and Dong Yu. "NaturalConv: A Chinese Dialogue Dataset Towards Multi-turn Topic-driven Conversation." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 16 (May 18, 2021): 14006–14. http://dx.doi.org/10.1609/aaai.v35i16.17649.

Abstract
In this paper, we propose a Chinese multi-turn topic-driven conversation dataset, NaturalConv, which allows the participants to chat anything they want as long as any element from the topic is mentioned and the topic shift is smooth. Our corpus contains 19.9K conversations from six domains, and 400K utterances with an average turn number of 20.1. These conversations contain in-depth discussions on related topics or widely natural transition between multiple topics. We believe either way is normal for human conversation. To facilitate the research on this corpus, we provide results of several benchmark models. Comparative results show that for this dataset, our current models are not able to provide significant improvement by introducing background knowledge/topic. Therefore, the proposed dataset should be a good benchmark for further research to evaluate the validity and naturalness of multi-turn conversation systems. Our dataset is available at https://ai.tencent.com/ailab/nlp/dialogue/#datasets.
39

Huch, Sebastian, and Markus Lienkamp. "Towards Minimizing the LiDAR Sim-to-Real Domain Shift: Object-Level Local Domain Adaptation for 3D Point Clouds of Autonomous Vehicles." Sensors 23, no. 24 (December 18, 2023): 9913. http://dx.doi.org/10.3390/s23249913.

Abstract
Perception algorithms for autonomous vehicles demand large, labeled datasets. Real-world data acquisition and annotation costs are high, making synthetic data from simulation a cost-effective option. However, training on one source domain and testing on a target domain can cause a domain shift attributed to local structure differences, resulting in a decrease in the model’s performance. We propose a novel domain adaptation approach to address this challenge and to minimize the domain shift between simulated and real-world LiDAR data. Our approach adapts 3D point clouds on the object level by learning the local characteristics of the target domain. A key feature involves downsampling to ensure domain invariance of the input data. The network comprises a state-of-the-art point completion network combined with a discriminator to guide training in an adversarial manner. We quantify the reduction in domain shift by training object detectors with the source, target, and adapted datasets. Our method successfully reduces the sim-to-real domain shift in a distribution-aligned dataset by almost 50%, from 8.63% to 4.36% 3D average precision. It is trained exclusively using target data, making it scalable and applicable to adapt point clouds from any source domain.
40

Othman, Walaa, Alexey Kashevnik, Ammar Ali, and Nikolay Shilov. "DriverMVT: In-Cabin Dataset for Driver Monitoring including Video and Vehicle Telemetry Information." Data 7, no. 5 (May 11, 2022): 62. http://dx.doi.org/10.3390/data7050062.

Abstract
Developing a driver monitoring system that can assess the driver’s state is a prerequisite and a key to improving road safety. With the success of deep learning, such systems can achieve a high accuracy if corresponding high-quality datasets are available. In this paper, we introduce DriverMVT (Driver Monitoring dataset with Videos and Telemetry). The dataset contains information about the driver head pose, heart rate, and driver behaviour inside the cabin like drowsiness and unfastened belt. This dataset can be used to train and evaluate deep learning models to estimate the driver’s health state, mental state, concentration level, and his/her activity in the cabin. Developing such systems that can alert the driver in case of drowsiness or distraction can reduce the number of accidents and increase the safety on the road. The dataset contains 1506 videos for 9 different drivers (7 males and 2 females), with a total of 5119k frames and a total duration of over 36 h. In addition, we evaluated the dataset with the multi-task temporal shift convolutional attention network (MTTS-CAN) algorithm. The algorithm’s mean average error on our dataset is 16.375 heartbeats per minute.
41

Ishihara, Kazuaki, and Koutarou Matsumoto. "Comparing the Robustness of ResNet, Swin-Transformer, and MLP-Mixer under Unique Distribution Shifts in Fundus Images." Bioengineering 10, no. 12 (December 1, 2023): 1383. http://dx.doi.org/10.3390/bioengineering10121383.

Abstract
Background: Diabetic retinopathy (DR) is the leading cause of visual impairment and blindness. Consequently, numerous deep learning models have been developed for the early detection of DR. Safety-critical applications employed in medical diagnosis must be robust to distribution shifts. Previous studies have focused on model performance under distribution shifts using natural image datasets such as ImageNet, CIFAR-10, and SVHN. However, there is a lack of research specifically investigating the performance using medical image datasets. To address this gap, we investigated trends under distribution shifts using fundus image datasets. Methods: We used the EyePACS dataset for DR diagnosis, introduced noise specific to fundus images, and evaluated the performance of ResNet, Swin-Transformer, and MLP-Mixer models under a distribution shift. The discriminative ability was evaluated using the Area Under the Receiver Operating Characteristic curve (ROC-AUC), while the calibration ability was evaluated using the monotonic sweep calibration error (ECE sweep). Results: Swin-Transformer exhibited a higher ROC-AUC than ResNet under all types of noise and displayed a smaller reduction in the ROC-AUC due to noise. ECE sweep did not show a consistent trend across different model architectures. Conclusions: Swin-Transformer consistently demonstrated superior discrimination compared to ResNet. This trend persisted even under unique distribution shifts in the fundus images.
42

Takahashi, Satoshi, Masamichi Takahashi, Manabu Kinoshita, Mototaka Miyake, Jun Sese, Kazuma Kobayashi, Koichi Ichimura, Yoshitaka Narita, Ryuji Hamamoto, and Consortium of Molecular Diagnosis of glioma. "RBIO-03. INITIAL RESULT OF DEVELOP ROBUST DEEP LEARNING MODEL FOR DETECTING GENOMIC STATUS IN GLIOMAS AGAINST IMAGE DIFFERENCES AMONG FACILITIES." Neuro-Oncology 23, Supplement_6 (November 2, 2021): vi192. http://dx.doi.org/10.1093/neuonc/noab196.760.

Abstract
BACKGROUND The importance of detecting the genomic status of gliomas is increasingly recognized, and IDH (isocitrate dehydrogenase) mutation and TERT (telomerase reverse transcriptase) promoter mutation have a significant impact on treatment decisions. Noninvasive prediction of these genomic statuses in gliomas is a challenging problem; however, a deep learning model using magnetic resonance imaging (MRI) can be a solution. Image differences among facilities that cause performance degradation, called domain shift, have also been reported in other tasks such as brain tumor segmentation. We investigated whether a deep learning model could predict the gene status, and if so, to what extent it would be affected by domain shift. METHOD We used the Multimodal Brain Tumor Segmentation Challenge (BraTS) data and the Japanese cohort (JC) dataset, which consists of brain tumor images collected from 544 patients in 10 facilities in Japan. We focused on IDH mutation and TERT promoter mutation. The deep learning models to predict the statuses of these genes were trained on the BraTS dataset or on the training portion of the JC dataset, and their accuracy was evaluated on the test portion of the JC dataset. RESULTS The IDH mutation prediction model trained on the BraTS dataset showed 80.0% accuracy on the validation portion of the BraTS dataset, but only 67.3% on the test portion of the JC dataset. The TERT promoter mutation prediction model trained on the training portion of the JC dataset showed only 49% accuracy on the test portion of the JC dataset. CONCLUSION IDH mutation can be predicted by deep learning models using MRI, but the performance degradation caused by domain shift was significant. On the other hand, TERT promoter mutation could not be predicted accurately enough by current deep learning techniques. For both mutations, further studies are needed.
Los estilos APA, Harvard, Vancouver, ISO, etc.
43

Allen, Robert C., Mattia C. Bertazzini y Leander Heldring. "The Economic Origins of Government". American Economic Review 113, n.º 10 (1 de octubre de 2023): 2507–45. http://dx.doi.org/10.1257/aer.20201919.

Texto completo
Resumen
We test between cooperative and extractive theories of the origins of government. We use river shifts in southern Iraq as a natural experiment, in a new archeological panel dataset. A shift away creates a local demand for a government to coordinate because private river irrigation needs to be replaced with public canals. It disincentivizes local extraction as land is no longer productive without irrigation. Consistent with a cooperative theory of government, a river shift away led to state formation, canal construction, and the payment of tribute. We argue that the first governments coordinated between extended households which implemented public good provision. (JEL D72, H11, H41, N45, N55, Q15)
Los estilos APA, Harvard, Vancouver, ISO, etc.
44

Wu, Teng, Bruno Vallet, Marc Pierrot-Deseilligny y Ewelina Rupnik. "An evaluation of Deep Learning based stereo dense matching dataset shift from aerial images and a large scale stereo dataset". International Journal of Applied Earth Observation and Geoinformation 128 (abril de 2024): 103715. http://dx.doi.org/10.1016/j.jag.2024.103715.

Texto completo
Los estilos APA, Harvard, Vancouver, ISO, etc.
45

Asopa, U., S. Kumar y P. K. Thakur. "PSInSAR Study of Lyngenfjord Norway, using TerraSAR-X Data". ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences IV-5 (15 de noviembre de 2018): 245–51. http://dx.doi.org/10.5194/isprs-annals-iv-5-245-2018.

Texto completo
Resumen
Abstract. This paper explores the potential of the Persistent Scatterer Interferometric Synthetic Aperture Radar (PSInSAR) technique, an extension of the existing DInSAR technique, for measuring landslide movement. In PSInSAR, movement is measured from the phase shift over time of scatterers in the study area whose backscattering remains stable throughout the observation period. Using this technique, 32 datasets acquired between 2009 and 2011 over Troms County, Lyngen Fjord, Norway, were analysed. The data were acquired by the TerraSAR-X and TanDEM-X pair in Stripmap acquisition mode. The datasets were coregistered to the master image with subpixel accuracy to align them correctly. Atmospheric Phase Screen (APS) estimation was performed to remove phase decorrelation caused by the atmosphere, motion, and other effects, using phase-unwrapping algorithms that allow the processing of sparse data; the atmospheric effect was further reduced by temporal analysis of the phase shift in interferograms of successive datasets. The study shows that terrain displacement can be estimated through temporal analysis of TerraSAR-X data. The velocity output is displayed on a map of terrain movement, and data properties such as the temporal and spatial baseline distributions are shown in charts. Other outputs include the Atmospheric Phase Screen, the sparse point distribution, and a reflectivity map of the study area, displayed over a terrain map. The terrain movement velocity is found to be in the range of −40 mm/yr to −70 mm/yr.
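The core PSInSAR relation behind these velocity estimates, converting an interferometric phase change into line-of-sight displacement, can be sketched as follows. The sign convention and the simple least-squares velocity fit are assumptions, and atmospheric and orbital corrections performed in the study are omitted here.

```python
import numpy as np

TSX_WAVELENGTH_M = 0.031  # TerraSAR-X operates in X-band, roughly 3.1 cm wavelength

def los_displacement(delta_phase_rad, wavelength_m=TSX_WAVELENGTH_M):
    """Line-of-sight displacement implied by an interferometric phase change.

    A phase change of 2*pi corresponds to half a wavelength of motion because the
    signal travels the path twice (satellite -> ground -> satellite). The sign
    convention (motion toward vs. away from the sensor) varies between processors.
    """
    return -(wavelength_m / (4.0 * np.pi)) * np.asarray(delta_phase_rad)

def mean_velocity_mm_per_year(phases_rad, times_years):
    """Least-squares linear velocity (mm/yr) from a persistent scatterer's phase time series."""
    disp_mm = los_displacement(phases_rad) * 1000.0
    slope, _ = np.polyfit(times_years, disp_mm, 1)
    return slope
```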
Los estilos APA, Harvard, Vancouver, ISO, etc.
46

Tang, Yansong, Xingyu Liu, Xumin Yu, Danyang Zhang, Jiwen Lu y Jie Zhou. "Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition". ACM Transactions on Multimedia Computing, Communications, and Applications 18, n.º 2 (31 de mayo de 2022): 1–24. http://dx.doi.org/10.1145/3472722.

Texto completo
Resumen
Rapid progress and superior performance have been achieved for skeleton-based action recognition recently. In this article, we investigate this problem under a cross-dataset setting, which is a new, pragmatic, and challenging task in real-world scenarios. Following the unsupervised domain adaptation (UDA) paradigm, the action labels are only available on a source dataset, but unavailable on a target dataset in the training stage. Different from the conventional adversarial learning-based approaches for UDA, we utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets. Our inspiration is drawn from Cubism, an art genre from the early 20th century, which breaks and reassembles the objects to convey a greater context. By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks to explore the temporal and spatial dependency of a skeleton-based action and improve the generalization ability of the model. We conduct experiments on six datasets for skeleton-based action recognition, including three large-scale datasets (NTU RGB+D, PKU-MMD, and Kinetics) where new cross-dataset settings and benchmarks are established. Extensive results demonstrate that our method outperforms state-of-the-art approaches. The source codes of our model and all the compared methods are available at https://github.com/shanice-l/st-cubism.
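A minimal sketch of the temporal "Cubism" pretext task described above is given below. The sequence shape [T, J, C] and the three-segment split are assumptions; the paper's full method also permutes body parts and trains auxiliary classification heads on these self-supervised labels.

```python
import itertools
import numpy as np

def temporal_cubism_sample(sequence, n_segments=3, rng=None):
    """Split a skeleton sequence into segments, shuffle them, and return (shuffled, permutation label)."""
    rng = rng or np.random.default_rng()
    perms = list(itertools.permutations(range(n_segments)))  # 6 possible classes for 3 segments
    label = int(rng.integers(len(perms)))
    segments = np.array_split(sequence, n_segments, axis=0)  # split along the time axis
    shuffled = np.concatenate([segments[i] for i in perms[label]], axis=0)
    return shuffled, label

# Usage: x is a [T, J, C] array of joint coordinates; the model is trained to
# predict `label` from `shuffled` as an auxiliary task on the unlabeled target domain.
```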
Los estilos APA, Harvard, Vancouver, ISO, etc.
47

Guentchev, Galina, Joseph J. Barsugli y Jon Eischeid. "Homogeneity of Gridded Precipitation Datasets for the Colorado River Basin". Journal of Applied Meteorology and Climatology 49, n.º 12 (1 de diciembre de 2010): 2404–15. http://dx.doi.org/10.1175/2010jamc2484.1.

Texto completo
Resumen
Abstract Inhomogeneity in gridded meteorological data may arise from the inclusion of inhomogeneous station data or from aspects of the gridding procedure itself. However, the homogeneity of gridded datasets is rarely questioned, even though an analysis of trends or variability that uses inhomogeneous data could be misleading or even erroneous. Three gridded precipitation datasets that have been used in studies of the Upper Colorado River basin were tested for homogeneity in this study: that of Maurer et al., that of Beyene and Lettenmaier, and the Parameter–Elevation Regressions on Independent Slopes Model (PRISM) dataset of Daly et al. Four absolute homogeneity tests were applied to annual precipitation amounts on a grid cell and on a hydrologic subregion spatial scale for the periods 1950–99 and 1916–2006. The analysis detects breakpoints in 1977 and 1978 at many locations in all three datasets that may be due to an anomalously rapid shift in the Pacific decadal oscillation. One dataset showed breakpoints in the 1940s that might be due to the widespread change in the number of available observing stations used as input for that dataset. The results also indicated that the time series from the three datasets are sufficiently homogeneous for variability analysis during the 1950–99 period when aggregated on a subregional scale.
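As an illustration of the kind of absolute homogeneity test applied to such annual precipitation series, here is a minimal sketch of the Pettitt change-point test. This is an assumption about a representative test of that family, not a reproduction of the study's exact four-test procedure.

```python
import numpy as np

def pettitt_test(series):
    """Return (breakpoint index, approximate p-value) for a single shift in the series median."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    sign_matrix = np.sign(x[:, None] - x[None, :])
    # U_t = sum_{i <= t} sum_{j > t} sign(x_i - x_j)
    u = np.array([sign_matrix[: t + 1, t + 1 :].sum() for t in range(n - 1)])
    k = np.abs(u).max()
    breakpoint = int(np.abs(u).argmax())
    # Standard approximation for the significance of the maximum statistic
    p_value = 2.0 * np.exp(-6.0 * k**2 / (n**3 + n**2))
    return breakpoint, min(p_value, 1.0)
```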
Los estilos APA, Harvard, Vancouver, ISO, etc.
48

Sime, Louise C., Richard C. A. Hindmarsh y Hugh Corr. "Automated processing to derive dip angles of englacial radar reflectors in ice sheets". Journal of Glaciology 57, n.º 202 (2011): 260–66. http://dx.doi.org/10.3189/002214311796405870.

Texto completo
Resumen
Abstract. We present a novel automated processing method for obtaining layer dip from radio-echo sounding (RES) data. The method is robust, easily applicable and can be used to process large (several terabytes) ground and airborne RES datasets using modest computing resources. We give test results from the application of the method to two Antarctic datasets: the Fletcher Promontory ground-based radar dataset and the Wilkes Subglacial Basin airborne radar dataset. The automated RES processing (ARESP) method comprises the basic steps: (1) RES noise reduction; (2) radar layer identification; (3) isolation of individual ‘layer objects’; (4) measurement of orientation and other object properties; (5) elimination of noise in the orientation data; and (6) collation of the valid dip information. The apparent dip datasets produced by the method will aid glaciologists seeking to understand ice-flow dynamics in Greenland and Antarctica: ARESP could enable a shift from selective regional case studies to ice-sheet-scale studies.
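Step (4) of an ARESP-style pipeline, measuring the orientation of an isolated layer object, could look like the sketch below. The boolean layer mask and the PCA-based orientation estimate are assumptions for illustration, not the paper's exact implementation.

```python
import numpy as np

def layer_dip_degrees(layer_mask, depth_spacing_m=1.0, trace_spacing_m=1.0):
    """Apparent dip angle of one layer object, from the principal direction of its pixels."""
    rows, cols = np.nonzero(layer_mask)                      # rows ~ depth, cols ~ along-track distance
    coords = np.column_stack([cols * trace_spacing_m, rows * depth_spacing_m])
    coords = coords - coords.mean(axis=0)
    _, _, vt = np.linalg.svd(coords, full_matrices=False)    # first right singular vector = layer direction
    dx, dz = vt[0]
    return np.degrees(np.arctan2(abs(dz), abs(dx)))          # dip measured from the horizontal
```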
Los estilos APA, Harvard, Vancouver, ISO, etc.
49

Sharif, Muhammad Imran, Muhammad Attique Khan, Abdullah Alqahtani, Muhammad Nazir, Shtwai Alsubai, Adel Binbusayyis y Robertas Damaševičius. "Deep Learning and Kurtosis-Controlled, Entropy-Based Framework for Human Gait Recognition Using Video Sequences". Electronics 11, n.º 3 (21 de enero de 2022): 334. http://dx.doi.org/10.3390/electronics11030334.

Texto completo
Resumen
Gait is commonly defined as the movement pattern of the limbs over a hard substrate, and it serves as a source of identification information for various computer-vision and image-understanding techniques. A variety of parameters, such as human clothing, angle shift, walking style, occlusion, and so on, have a significant impact on gait-recognition systems, making the scene quite complex to handle. In this article, we propose a system that effectively handles problems associated with viewing angle shifts and walking styles in a real-time environment. The following steps are included in the proposed novel framework: (a) real-time video capture, (b) feature extraction using transfer learning on the ResNet101 deep model, and (c) feature selection using the proposed kurtosis-controlled entropy (KcE) approach, followed by a correlation-based feature fusion step. The most discriminant features are then classified using the most advanced machine learning classifiers. The simulation process is fed by the CASIA B dataset as well as a real-time captured dataset. On selected datasets, the accuracy is 95.26% and 96.60%, respectively. When compared to several known techniques, the results show that our proposed framework outperforms them all.
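One possible reading of a kurtosis-controlled, entropy-based selection step is sketched below. The gating rule, histogram entropy, and keep ratio are illustrative assumptions, since the exact KcE formulation is specific to the paper.

```python
import numpy as np
from scipy.stats import kurtosis, entropy

def kce_select(features, keep_ratio=0.5, n_bins=32):
    """Keep the highest-entropy features among those whose kurtosis exceeds the mean kurtosis."""
    feats = np.asarray(features, dtype=float)           # shape: [samples, features]
    kurt = kurtosis(feats, axis=0)
    candidates = np.flatnonzero(kurt >= kurt.mean())    # kurtosis-controlled gate (assumed rule)
    ent = []
    for idx in candidates:
        hist, _ = np.histogram(feats[:, idx], bins=n_bins)
        ent.append(entropy(hist + 1e-12))               # Shannon entropy of the value histogram
    order = np.argsort(ent)[::-1]
    n_keep = max(1, int(keep_ratio * len(candidates)))
    return candidates[order[:n_keep]]                   # indices of the selected features
```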
Los estilos APA, Harvard, Vancouver, ISO, etc.
50

Hidalgo Davila, Mateo, Maria Baldeon-Calisto, Juan Jose Murillo, Bernardo Puente-Mejia, Danny Navarrete, Daniel Riofrío, Noel Peréz, Diego S. Benítez y Ricardo Flores Moyano. "Analyzing the Effect of Basic Data Augmentation for COVID-19 Detection through a Fractional Factorial Experimental Design". Emerging Science Journal 7 (24 de septiembre de 2022): 1–16. http://dx.doi.org/10.28991/esj-2023-sper-01.

Texto completo
Resumen
The COVID-19 pandemic has created a worldwide healthcare crisis. Convolutional Neural Networks (CNNs) have recently been used with encouraging results to help detect COVID-19 from chest X-ray images. However, to generalize well to unseen data, CNNs require large labeled datasets. Due to the lack of publicly available COVID-19 datasets, most CNNs apply various data augmentation techniques during training. However, there has not been a thorough statistical analysis of how data augmentation operations affect classification performance for COVID-19 detection. In this study, a fractional factorial experimental design is used to examine the impact of basic augmentation methods on COVID-19 detection. The latter enables identifying which particular data augmentation techniques and interactions have a statistically significant impact on the classification performance, whether positively or negatively. Using the CoroNet architecture and two publicly available COVID-19 datasets, the most common basic augmentation methods in the literature are evaluated. The results of the experiments demonstrate that the methods of zoom, range, and height shift positively impact the model's accuracy in dataset 1. The performance of dataset 2 is unaffected by any of the data augmentation operations. Additionally, a new state-of-the-art performance is achieved on both datasets by training CoroNet with the ideal data augmentation values found using the experimental design. Specifically, in dataset 1, 97% accuracy, 93% precision, and 97.7% recall were attained, while in dataset 2, 97% accuracy, 97% precision, and 97.6% recall were achieved. These results indicate that analyzing the effects of data augmentations on a particular task and dataset is essential for the best performance.
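The fractional factorial idea can be sketched as a two-level half-fraction over augmentation factors. The factor names, the 2^(4-1) design with generator D = ABC, and the on/off coding below are assumptions and may differ from the paper's actual design and factor set.

```python
import itertools
import numpy as np

FACTORS = ["rotation", "zoom", "width_shift", "height_shift"]  # hypothetical factor names

def half_fraction_design():
    """Return a 2^(4-1) design matrix in -1/+1 coding: 8 training runs instead of 16."""
    runs = []
    for a, b, c in itertools.product([-1, 1], repeat=3):
        runs.append([a, b, c, a * b * c])   # generator: D = ABC (defining relation I = ABCD)
    return np.array(runs)

def runs_as_configs(design):
    """Translate coded levels into on/off augmentation configurations for each training run."""
    return [{f: (lvl == 1) for f, lvl in zip(FACTORS, row)} for row in design]

# Each of the 8 configurations is used to train the model once; main effects are then
# estimated by contrasting mean accuracy at the +1 vs. -1 level of each factor.
```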
Los estilos APA, Harvard, Vancouver, ISO, etc.