Journal articles on the topic 'Dataset shift'

To see the other types of publications on this topic, follow the link: Dataset shift.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'Dataset shift.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Sharet, Nir, and Ilan Shimshoni. "Analyzing Data Changes using Mean Shift Clustering." International Journal of Pattern Recognition and Artificial Intelligence 30, no. 07 (May 25, 2016): 1650016. http://dx.doi.org/10.1142/s0218001416500166.

Abstract:
A nonparametric unsupervised method for analyzing changes in complex datasets is proposed. It is based on the mean shift clustering algorithm. Mean shift is used to cluster the old and new datasets and compare the results in a nonparametric manner. Each point from the new dataset naturally belongs to a cluster of points from its dataset. The method is also able to find to which cluster the point belongs in the old dataset and use this information to report qualitative differences between that dataset and the new one. Changes in local cluster distribution are also reported. The report can then be used to try to understand the underlying reasons which caused the changes in the distributions. On the basis of this method, a transductive transfer learning method for automatically labeling data from the new dataset is also proposed. This labeled data is used, in addition to the old training set, to train a classifier better suited to the new dataset. The algorithm has been implemented and tested on simulated and real (a stereo image pair) datasets. Its performance was also compared with several state-of-the-art methods.
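The mean shift procedure underlying this method (and several later entries) can be sketched in a few lines. The following one-dimensional, flat-kernel toy is an illustration only, not the authors' implementation; the function name and data are invented. Each point is repeatedly moved to the mean of its neighbours within a bandwidth; converged positions are the density modes, and points sharing a mode form a cluster.

```python
# Illustrative 1-D mean shift with a flat kernel (hypothetical sketch).
# Each point climbs to the mean of its neighbours within `bandwidth`
# until it stops moving; the converged positions are the modes.

def mean_shift_1d(points, bandwidth=1.0, tol=1e-6, max_iter=100):
    modes = []
    for x in points:
        for _ in range(max_iter):
            neighbours = [p for p in points if abs(p - x) <= bandwidth]
            new_x = sum(neighbours) / len(neighbours)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(round(x, 3))
    return modes

# Two well-separated groups collapse onto two modes:
data = [0.0, 0.2, 0.4, 10.0, 10.3]
print(sorted(set(mean_shift_1d(data, bandwidth=1.0))))  # → [0.2, 10.15]
```

Comparing which mode each point of a new dataset converges to under the old dataset's density is the intuition behind the nonparametric comparison described above.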
2

Adams, Niall. "Dataset Shift in Machine Learning." Journal of the Royal Statistical Society: Series A (Statistics in Society) 173, no. 1 (January 2010): 274. http://dx.doi.org/10.1111/j.1467-985x.2009.00624_10.x.

3

Guo, Lin Lawrence, Stephen R. Pfohl, Jason Fries, Jose Posada, Scott Lanyon Fleming, Catherine Aftandilian, Nigam Shah, and Lillian Sung. "Systematic Review of Approaches to Preserve Machine Learning Performance in the Presence of Temporal Dataset Shift in Clinical Medicine." Applied Clinical Informatics 12, no. 04 (August 2021): 808–15. http://dx.doi.org/10.1055/s-0041-1735184.

Abstract:
Objective: The change in performance of machine learning models over time as a result of temporal dataset shift is a barrier to machine learning-derived models facilitating decision-making in clinical practice. Our aim was to describe technical procedures used to preserve the performance of machine learning models in the presence of temporal dataset shifts. Methods: Studies were included if they were fully published articles that used machine learning and implemented a procedure to mitigate the effects of temporal dataset shift in a clinical setting. We described how dataset shift was measured, the procedures used to preserve model performance, and their effects. Results: Of 4,457 potentially relevant publications identified, 15 were included. The impact of temporal dataset shift was primarily quantified using changes, usually deterioration, in calibration or discrimination. Calibration deterioration was more common (n = 11) than discrimination deterioration (n = 3). Mitigation strategies were categorized as model level or feature level. Model-level approaches (n = 15) were more common than feature-level approaches (n = 2), with the most common approaches being model refitting (n = 12), probability calibration (n = 7), model updating (n = 6), and model selection (n = 6). In general, all mitigation strategies were successful at preserving calibration but not uniformly successful in preserving discrimination. Conclusion: There was limited research on preserving the performance of machine learning models in the presence of temporal dataset shift in clinical medicine. Future research could focus on the impact of dataset shift on clinical decision making, benchmark the mitigation strategies on a wider range of datasets and tasks, and identify optimal strategies for specific settings.
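Among the model-level strategies this review tallies, probability calibration is the simplest to sketch. The toy below illustrates "calibration-in-the-large": shifting a model's logits by a constant until the mean predicted risk matches the event rate seen in newer data. This is a generic hedged sketch, not a procedure taken from any reviewed study; the function names and numbers are invented.

```python
import math

# Hypothetical sketch of intercept recalibration ("calibration-in-the-large"):
# find an additive logit shift `a` so that the mean of sigmoid(logit(p) + a)
# over recent predictions equals the event rate observed after the shift.

def logit(p):
    return math.log(p / (1.0 - p))

def recalibrate_intercept(probs, new_event_rate, lr=0.5, steps=2000):
    a = 0.0
    for _ in range(steps):
        shifted = [1.0 / (1.0 + math.exp(-(logit(p) + a))) for p in probs]
        a += lr * (new_event_rate - sum(shifted) / len(shifted))
    return a

old_probs = [0.1, 0.2, 0.3, 0.4]            # model outputs on recent data
a = recalibrate_intercept(old_probs, 0.40)  # event rate drifted upward
new_probs = [1.0 / (1.0 + math.exp(-(logit(p) + a))) for p in old_probs]
print(round(sum(new_probs) / len(new_probs), 2))  # close to 0.40
```

The fixed-point update converges because the mean predicted probability increases monotonically in the offset; richer procedures (refitting, Platt scaling) replace this single intercept with more parameters.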
4

He, Zhiqiang. "ECG Heartbeat Classification Under Dataset Shift." Journal of Intelligent Medicine and Healthcare 1, no. 2 (2022): 79–89. http://dx.doi.org/10.32604/jimh.2022.036624.

5

Kim, Doyoung, Inwoong Lee, Dohyung Kim, and Sanghoon Lee. "Action Recognition Using Close-Up of Maximum Activation and ETRI-Activity3D LivingLab Dataset." Sensors 21, no. 20 (October 12, 2021): 6774. http://dx.doi.org/10.3390/s21206774.

Abstract:
The development of action recognition models has shown great performance on various video datasets. Nevertheless, because existing datasets lack rich data on target actions, they are insufficient for the action recognition applications required by industry. To satisfy this requirement, datasets composed of target actions with high availability have been created, but it is difficult to capture the varied characteristics of actual environments because the video data are generated in a specific environment. In this paper, we introduce the new ETRI-Activity3D-LivingLab dataset, which provides action sequences in actual environments and helps to handle the network generalization issue caused by dataset shift. When an action recognition model is trained on the ETRI-Activity3D and KIST SynADL datasets and evaluated on the ETRI-Activity3D-LivingLab dataset, its performance can be severely degraded because the datasets were captured in different environments. To reduce this dataset shift between training and testing datasets, we propose a close-up of maximum activation, which magnifies the most activated part of a video input in detail. In addition, we present various experimental results and analyses that show the dataset shift and demonstrate the effectiveness of the proposed method.
6

McGaughey, Georgia, W. Patrick Walters, and Brian Goldman. "Understanding covariate shift in model performance." F1000Research 5 (April 7, 2016): 597. http://dx.doi.org/10.12688/f1000research.8317.1.

Abstract:
Three (3) different methods (logistic regression, covariate shift and k-NN) were applied to five (5) internal datasets and one (1) external, publicly available dataset where covariate shift existed. In all cases, k-NN’s performance was inferior to either logistic regression or covariate shift. Surprisingly, there was no obvious advantage to using covariate shift to reweight the training data in the examined datasets.
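The reweighting idea evaluated above can be sketched for a discrete covariate, where the importance weight of each training point is simply the ratio of its test-set frequency to its training-set frequency. This is a generic hedged illustration of covariate-shift reweighting, not the paper's exact setup; the data are invented.

```python
from collections import Counter

# Hypothetical sketch of covariate-shift importance weighting for a
# discrete covariate: weight(x) = p_test(x) / p_train(x), so training
# examples that are rare at test time are down-weighted.

def importance_weights(train_x, test_x):
    p_train = Counter(train_x)
    p_test = Counter(test_x)
    n_tr, n_te = len(train_x), len(test_x)
    return [(p_test[x] / n_te) / (p_train[x] / n_tr) for x in train_x]

train = ["a", "a", "a", "b"]   # mostly "a" during training
test = ["a", "b", "b", "b"]    # mostly "b" at test time
print(importance_weights(train, test))  # → [1/3, 1/3, 1/3, 3.0]
```

The weights sum to the training-set size, so a weighted empirical risk on the training data mimics the risk under the test covariate distribution.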
7

McGaughey, Georgia, W. Patrick Walters, and Brian Goldman. "Understanding covariate shift in model performance." F1000Research 5 (June 17, 2016): 597. http://dx.doi.org/10.12688/f1000research.8317.2.

Abstract:
Three (3) different methods (logistic regression, covariate shift and k-NN) were applied to five (5) internal datasets and one (1) external, publicly available dataset where covariate shift existed. In all cases, k-NN’s performance was inferior to either logistic regression or covariate shift. Surprisingly, there was no obvious advantage to using covariate shift to reweight the training data in the examined datasets.
8

McGaughey, Georgia, W. Patrick Walters, and Brian Goldman. "Understanding covariate shift in model performance." F1000Research 5 (October 17, 2016): 597. http://dx.doi.org/10.12688/f1000research.8317.3.

Abstract:
Three (3) different methods (logistic regression, covariate shift and k-NN) were applied to five (5) internal datasets and one (1) external, publicly available dataset where covariate shift existed. In all cases, k-NN’s performance was inferior to either logistic regression or covariate shift. Surprisingly, there was no obvious advantage to using covariate shift to reweight the training data in the examined datasets.
9

Becker, Aneta, and Jarosław Becker. "Dataset shift assessment measures in monitoring predictive models." Procedia Computer Science 192 (2021): 3391–402. http://dx.doi.org/10.1016/j.procs.2021.09.112.

10

Finlayson, Samuel G., Adarsh Subbaswamy, Karandeep Singh, John Bowers, Annabel Kupke, Jonathan Zittrain, Isaac S. Kohane, and Suchi Saria. "The Clinician and Dataset Shift in Artificial Intelligence." New England Journal of Medicine 385, no. 3 (July 15, 2021): 283–86. http://dx.doi.org/10.1056/nejmc2104626.

11

Moreno-Torres, Jose G., Troy Raeder, Rocío Alaiz-Rodríguez, Nitesh V. Chawla, and Francisco Herrera. "A unifying view on dataset shift in classification." Pattern Recognition 45, no. 1 (January 2012): 521–30. http://dx.doi.org/10.1016/j.patcog.2011.06.019.

12

Subbaswamy, Adarsh, Bryant Chen, and Suchi Saria. "A unifying causal framework for analyzing dataset shift-stable learning algorithms." Journal of Causal Inference 10, no. 1 (January 1, 2022): 64–89. http://dx.doi.org/10.1515/jci-2021-0042.

Abstract:
Recent interest in the external validity of prediction models (i.e., the problem of different train and test distributions, known as dataset shift) has produced many methods for finding predictive distributions that are invariant to dataset shifts and can be used for prediction in new, unseen environments. However, these methods consider different types of shifts and have been developed under disparate frameworks, making it difficult to theoretically analyze how solutions differ with respect to stability and accuracy. Taking a causal graphical view, we use a flexible graphical representation to express various types of dataset shifts. Given a known graph of the data generating process, we show that all invariant distributions correspond to a causal hierarchy of graphical operators, which disable the edges in the graph that are responsible for the shifts. The hierarchy provides a common theoretical underpinning for understanding when and how stability to shifts can be achieved, and in what ways stable distributions can differ. We use it to establish conditions for minimax optimal performance across environments, and derive new algorithms that find optimal stable distributions. Using this new perspective, we empirically demonstrate that there is a tradeoff between minimax and average performance.
13

Xie, Y., K. Schindler, J. Tian, and X. X. Zhu. "EXPLORING CROSS-CITY SEMANTIC SEGMENTATION OF ALS POINT CLOUDS." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B2-2021 (June 28, 2021): 247–54. http://dx.doi.org/10.5194/isprs-archives-xliii-b2-2021-247-2021.

Abstract:
Deep learning models achieve excellent semantic segmentation results for airborne laser scanning (ALS) point clouds, if sufficient training data are provided. Increasing amounts of annotated data are becoming publicly available thanks to contributors from all over the world. However, models trained on a specific dataset typically exhibit poor performance on other datasets. That is, there are significant domain shifts, as data captured in different environments or by distinct sensors have different distributions. In this work, we study this domain shift and potential strategies to mitigate it, using two popular ALS datasets: the ISPRS Vaihingen benchmark from Germany and the LASDU benchmark from China. We compare different training strategies for cross-city ALS point cloud semantic segmentation. In our experiments, we analyse three factors that may lead to domain shift and affect the learning: point cloud density, LiDAR intensity, and the role of data augmentation. Moreover, we evaluate a well-known standard method of domain adaptation, deep CORAL (Sun and Saenko, 2016). In our experiments, adapting the point cloud density and appropriate data augmentation both help to reduce the domain gap and improve segmentation accuracy. On the contrary, intensity features can bring an improvement within a dataset, but deteriorate the generalisation across datasets. Deep CORAL does not further improve the accuracy over the simple adaptation of density and data augmentation, although it can mitigate the impact of improperly chosen point density, intensity features, and further dataset biases such as lack of diversity.
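Deep CORAL aligns the second-order statistics of deep features across domains. A one-dimensional caricature of that idea (an assumption-laden sketch, not the cited implementation, which matches full covariance matrices of network features) rescales a source feature so its spread matches the target domain's:

```python
import statistics as st

# Hypothetical 1-D caricature of correlation alignment: rescale a source
# feature about its mean so its standard deviation matches the target
# domain's. Real deep CORAL does this for full feature covariance matrices.

def align_variance(source, target):
    s_sd = st.pstdev(source)
    t_sd = st.pstdev(target)
    s_mu = st.fmean(source)
    return [(x - s_mu) * (t_sd / s_sd) + s_mu for x in source]

source = [0.0, 1.0, 2.0, 3.0]   # e.g. a density feature in city A
target = [0.0, 2.0, 4.0, 6.0]   # same feature in city B, twice the spread
aligned = align_variance(source, target)
print(round(st.pstdev(aligned), 6))  # matches the target spread
```

The experiments above suggest that simple normalisations of this kind (density adaptation, augmentation) can close much of the cross-city gap on their own.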
14

ZHAO, YUZHONG, BABAK ALIPANAHI, SHUAI CHENG LI, and MING LI. "PROTEIN SECONDARY STRUCTURE PREDICTION USING NMR CHEMICAL SHIFT DATA." Journal of Bioinformatics and Computational Biology 08, no. 05 (October 2010): 867–84. http://dx.doi.org/10.1142/s0219720010004987.

Abstract:
Accurate determination of protein secondary structure from chemical shift information is a key step for NMR tertiary structure determination. Relatively little work has been done on this subject. There needs to be a systematic investigation of algorithms that are (a) robust for large datasets; (b) easily extendable to (the dynamic) new databases; and (c) approaching the limit of accuracy. We introduce new approaches that use the k-nearest neighbor algorithm to do the basic prediction and the BCJR algorithm to smooth the predictions and combine predictions from chemical shifts with predictions based on sequence information only. Our new system, SUCCES, improves the accuracy of all existing methods on a large dataset of 805 proteins (at 86% Q3 accuracy, and at 92.6% accuracy when the boundary residues are ignored), and it is easily extendable to any new dataset without requiring any new training. The software is publicly available at .
15

Chakraborty, Saptarshi, Debolina Paul, and Swagatam Das. "Automated Clustering of High-dimensional Data with a Feature Weighted Mean Shift Algorithm." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 8 (May 18, 2021): 6930–38. http://dx.doi.org/10.1609/aaai.v35i8.16854.

Abstract:
Mean shift is a simple iterative procedure that gradually shifts data points towards the mode, which denotes the highest density of data points in the region. Mean shift algorithms have been effectively used for data denoising, mode seeking, and finding the number of clusters in a dataset in an automated fashion. However, the merits of mean shift quickly fade away as the data dimensions increase and only a handful of features contain useful information about the cluster structure of the data. We propose a simple yet elegant feature-weighted variant of mean shift to efficiently learn the feature importance, thus extending the merits of mean shift to high-dimensional data. The resulting algorithm not only outperforms the conventional mean shift clustering procedure but also preserves its computational simplicity. In addition, the proposed method comes with rigorous theoretical convergence guarantees and a convergence rate of at least cubic order. The efficacy of our proposal is thoroughly assessed through experimental comparison against baseline and state-of-the-art clustering methods on synthetic as well as real-world datasets.
16

Tasche, Dirk. "Factorizable Joint Shift in Multinomial Classification." Machine Learning and Knowledge Extraction 4, no. 3 (September 10, 2022): 779–802. http://dx.doi.org/10.3390/make4030038.

Abstract:
Factorizable joint shift (FJS) was recently proposed as a type of dataset shift for which the complete characteristics can be estimated from feature data observations on the test dataset by a method called Joint Importance Aligning. For the multinomial (multiclass) classification setting, we derive a representation of factorizable joint shift in terms of the source (training) distribution, the target (test) prior class probabilities and the target marginal distribution of the features. On the basis of this result, we propose alternatives to joint importance aligning and, at the same time, point out that factorizable joint shift is not fully identifiable if no class label information on the test dataset is available and no additional assumptions are made. Other results of the paper include correction formulae for the posterior class probabilities both under general dataset shift and factorizable joint shift. In addition, we investigate the consequences of assuming factorizable joint shift for the bias caused by sample selection.
17

Xue, Zhiyun, Feng Yang, Sivaramakrishnan Rajaraman, Ghada Zamzmi, and Sameer Antani. "Cross Dataset Analysis of Domain Shift in CXR Lung Region Detection." Diagnostics 13, no. 6 (March 11, 2023): 1068. http://dx.doi.org/10.3390/diagnostics13061068.

Abstract:
Domain shift is one of the key challenges affecting reliability in medical imaging-based machine learning predictions. It is of significant importance to investigate this issue to gain insights into its characteristics toward determining controllable parameters to minimize its impact. In this paper, we report our efforts on studying and analyzing domain shift in lung region detection in chest radiographs. We used five chest X-ray datasets, collected from different sources, which have manual markings of lung boundaries in order to conduct extensive experiments toward this goal. We compared the characteristics of these datasets from three aspects: information obtained from metadata or an image header, image appearance, and features extracted from a pretrained model. We carried out experiments to evaluate and compare model performances within each dataset and across datasets in four scenarios using different combinations of datasets. We proposed a new feature visualization method to provide explanations for the applied object detection network on the obtained quantitative results. We also examined chest X-ray modality-specific initialization, catastrophic forgetting, and model repeatability. We believe the observations and discussions presented in this work could help to shed some light on the importance of the analysis of training data for medical imaging machine learning research, and could provide valuable guidance for domain shift analysis.
18

Sáez, José A., and José L. Romero-Béjar. "Impact of Regressand Stratification in Dataset Shift Caused by Cross-Validation." Mathematics 10, no. 14 (July 21, 2022): 2538. http://dx.doi.org/10.3390/math10142538.

Abstract:
Data that have not been modeled cannot be correctly predicted. Under this assumption, this research studies how k-fold cross-validation can introduce dataset shift in regression problems. This causes the data distributions in the training and test sets to differ and, therefore, degrades the estimation of model performance. Even though stratification of the output variable is widely used in the field of classification to reduce the dataset shift induced by cross-validation, its use in regression is not widespread in the literature. This paper analyzes the consequences for dataset shift of including different regressand stratification schemes in cross-validation with regression data. The results obtained show that these schemes create more similar training and test sets, reducing the presence of dataset shift related to cross-validation. The bias and deviation of the performance estimates obtained by regression algorithms are improved using the highest numbers of strata, as is the number of cross-validation repetitions necessary to obtain these better results.
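The stratification idea can be sketched generically (an illustrative toy, not the paper's exact protocol; the fold-assignment rule and data are invented): sort examples by the continuous target and deal them round-robin across folds, which is equivalent to forming strata of size equal to the number of folds, so every fold covers the regressand's full range.

```python
# Hypothetical sketch of regressand stratification for cross-validation:
# sort indices by the target value, then deal them round-robin, so each
# fold samples every region of the target distribution.

def stratified_folds(y, n_folds=2):
    order = sorted(range(len(y)), key=lambda i: y[i])
    folds = [[] for _ in range(n_folds)]
    for rank, idx in enumerate(order):
        folds[rank % n_folds].append(idx)
    return folds

y = [0.1, 0.2, 0.9, 1.0, 5.0, 5.1, 9.0, 9.1]
folds = stratified_folds(y, n_folds=2)
print([sorted(y[i] for i in f) for f in folds])
```

With this dealing rule the two folds have nearly identical target means, which is exactly the similarity between training and test distributions the paper measures.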
19

Turhan, Burak. "On the dataset shift problem in software engineering prediction models." Empirical Software Engineering 17, no. 1-2 (October 12, 2011): 62–74. http://dx.doi.org/10.1007/s10664-011-9182-8.

20

Becker, Jarosław, and Aneta Becker. "Predictive Accuracy Index in evaluating the dataset shift (case study)." Procedia Computer Science 225 (2023): 3342–51. http://dx.doi.org/10.1016/j.procs.2023.10.328.

21

Aryal, Jagannath, and Bipul Neupane. "Multi-Scale Feature Map Aggregation and Supervised Domain Adaptation of Fully Convolutional Networks for Urban Building Footprint Extraction." Remote Sensing 15, no. 2 (January 13, 2023): 488. http://dx.doi.org/10.3390/rs15020488.

Abstract:
Automated building footprint extraction requires the Deep Learning (DL)-based semantic segmentation of high-resolution Earth observation images. Fully convolutional networks (FCNs) such as U-Net and ResUNET are widely used for such segmentation. The evolving FCNs suffer from the inadequate use of multi-scale feature maps in their convolutional neural network (CNN) backbones. Furthermore, the DL methods are not robust in cross-domain settings due to domain-shift problems. Two scale-robust novel networks, namely MSA-UNET and MSA-ResUNET, are developed in this study by aggregating the multi-scale feature maps in U-Net and ResUNET with partial concepts of the feature pyramid network (FPN). Furthermore, supervised domain adaptation is investigated to minimise the effects of domain-shift between the two datasets. The datasets include the benchmark WHU Building dataset and a developed dataset with 5× fewer samples, 4× lower spatial resolution and complex high-rise buildings and skyscrapers. The newly developed networks are compared to six state-of-the-art FCNs using five metrics: pixel accuracy, adjusted accuracy, F1 score, intersection over union (IoU), and the Matthews Correlation Coefficient (MCC). The proposed networks outperform the FCNs in the majority of the accuracy measures in both datasets. Compared to the larger dataset, the network trained on the smaller one shows significantly higher robustness in terms of adjusted accuracy (by 18%), F1 score (by 31%), IoU (by 27%), and MCC (by 29%) during the cross-domain validation of MSA-UNET. MSA-ResUNET shows similar improvements, leading to the conclusion that the proposed networks, when trained using domain adaptation, increase robustness and minimise the domain-shift between datasets of different complexity.
22

Peng, Zhiyong, Changlin Han, Yadong Liu, and Zongtan Zhou. "Weighted Policy Constraints for Offline Reinforcement Learning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 8 (June 26, 2023): 9435–43. http://dx.doi.org/10.1609/aaai.v37i8.26130.

Abstract:
Offline reinforcement learning (RL) aims to learn a policy from a passively collected offline dataset. Naively applying existing RL methods to the static dataset raises distribution shift, causing these unconstrained RL methods to fail. To cope with the distribution shift problem, a common practice in offline RL is to constrain the policy, explicitly or implicitly, to stay close to the behavior policy. However, the available dataset usually contains sub-optimal or inferior actions; constraining the policy near all these actions will make the policy inevitably learn inferior behaviors, limiting the performance of the algorithm. Based on this observation, we propose a weighted policy constraints (wPC) method that only constrains the learned policy to desirable behaviors, making room for policy improvement on other parts. Our algorithm outperforms existing state-of-the-art offline RL algorithms on the D4RL offline gym datasets. Moreover, the proposed algorithm is simple to implement with few hyper-parameters, making the proposed wPC algorithm a robust offline RL method with low computational complexity.
23

Phongsasiri, Siriwan, and Suwanna Rasmequan. "Outlier Detection in Wellness Data using Probabilistic Mapped Mean-Shift Algorithms." ECTI Transactions on Computer and Information Technology (ECTI-CIT) 15, no. 2 (August 11, 2021): 258–66. http://dx.doi.org/10.37936/ecti-cit.2021152.244971.

Abstract:
In this paper, the Probabilistic Mapped Mean-Shift Algorithm is proposed to detect anomalous data in public datasets and local hospital children’s wellness clinic databases. The proposed framework consists of two main parts. First, the Probabilistic Mapping step consists of k-NN instance acquisition, data distribution calculation, and data point reposition. Truncated Gaussian Distribution (TGD) was used for controlling the boundary of the mapped points. Second, the Outlier Detection step consists of outlier score calculation and outlier selection. Experimental results show that the proposed algorithm outperformed the existing algorithms with real-world benchmark datasets and a Children’s Wellness Clinic dataset (CWD). Outlier detection accuracy obtained from the proposed algorithm based on Wellness, Stamps, Arrhythmia, Pima, and Parkinson datasets was 93%, 94%, 80%, 75%, and 72%, respectively.
24

Rodriguez-Vazquez, Javier, Miguel Fernandez-Cortizas, David Perez-Saura, Martin Molina, and Pascual Campoy. "Overcoming Domain Shift in Neural Networks for Accurate Plant Counting in Aerial Images." Remote Sensing 15, no. 6 (March 22, 2023): 1700. http://dx.doi.org/10.3390/rs15061700.

Abstract:
This paper presents a novel semi-supervised approach for accurate counting and localization of tropical plants in aerial images that can work in new visual domains in which the available data are not labeled. Our approach uses deep learning and domain adaptation, designed to handle domain shifts between the training and test data, which is a common challenge in agricultural applications. This method uses a source dataset with annotated plants and a target dataset without annotations, and adapts a model trained on the source dataset to the target dataset using unsupervised domain alignment and pseudolabeling. The experimental results show the effectiveness of this approach for plant counting in aerial images of pineapples under significant domain shift, achieving a reduction of up to 97% in the counting error (1.42 in absolute count) when compared to the supervised baseline (48.6 in absolute count).
25

Tappy, Nicolas, Anna Fontcuberta i Morral, and Christian Monachon. "Image shift correction, noise analysis, and model fitting of (cathodo-)luminescence hyperspectral maps." Review of Scientific Instruments 93, no. 5 (May 1, 2022): 053702. http://dx.doi.org/10.1063/5.0080486.

Abstract:
Hyperspectral imaging is an important asset of modern spectroscopy. It allows us to perform optical metrology at high spatial resolution, for example in cathodoluminescence in scanning electron microscopy. However, hyperspectral datasets present added challenges in their analysis compared to individually taken spectra due to their lower signal-to-noise ratio and specific aberrations. On the other hand, the large volume of information in a hyperspectral dataset allows the application of advanced statistical analysis methods derived from machine learning. In this article, we present a methodology for performing model fitting on hyperspectral maps, leveraging principal component analysis to perform a thorough noise analysis of the dataset. We explain how to correct the imaging shift artifact, specific to imaging spectroscopy, by evaluating it directly from the data. The impact of goodness-of-fit indicators and parameter uncertainties is discussed. We provide indications on how to apply this technique to a variety of hyperspectral datasets acquired using other experimental techniques. As a practical example, we provide an implementation of this analysis using the open-source Python library hyperspy, which is implemented using the well-established Jupyter Notebook framework in the scientific community.
26

Wang, Li, Dong Li, Han Liu, JinZhang Peng, Lu Tian, and Yi Shan. "Cross-Dataset Collaborative Learning for Semantic Segmentation in Autonomous Driving." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 2487–94. http://dx.doi.org/10.1609/aaai.v36i3.20149.

Abstract:
Semantic segmentation is an important task for scene understanding in self-driving cars and robotics, which aims to assign dense labels to all pixels in an image. Existing work typically improves semantic segmentation performance by exploring different network architectures on a target dataset. Little attention has been paid to building a unified system that simultaneously learns from multiple datasets, due to the inherent distribution shift across different datasets. In this paper, we propose a simple, flexible, and general method for semantic segmentation, termed Cross-Dataset Collaborative Learning (CDCL). Our goal is to train a unified model that improves performance on each dataset by leveraging information from all the datasets. Specifically, we first introduce a family of Dataset-Aware Blocks (DAB) as the fundamental computing units of the network, which help capture homogeneous convolutional representations and heterogeneous statistics across different datasets. Second, we present a Dataset Alternation Training (DAT) mechanism to facilitate the collaborative optimization procedure. We conduct extensive evaluations on diverse semantic segmentation datasets for autonomous driving. Experiments demonstrate that our method consistently achieves notable improvements over prior single-dataset and cross-dataset training methods without introducing extra FLOPs. In particular, with the same PSPNet (ResNet-18) architecture, our method outperforms the single-dataset baseline by 5.65%, 6.57%, and 5.79% mIoU on the validation sets of Cityscapes, BDD100K, and CamVid, respectively. We also apply CDCL to point cloud 3D semantic segmentation and achieve improved performance, which further validates the superiority and generality of our method. Code and models will be released.
27

He, Yue, Xinwei Shen, Renzhe Xu, Tong Zhang, Yong Jiang, Wenchao Zou, and Peng Cui. "Covariate-Shift Generalization via Random Sample Weighting." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 10 (June 26, 2023): 11828–36. http://dx.doi.org/10.1609/aaai.v37i10.26396.

Abstract:
Shifts in the marginal distribution of covariates from training to the test phase, named covariate-shifts, often lead to unstable prediction performance across agnostic testing data, especially under model misspecification. Recent literature on invariant learning attempts to learn an invariant predictor from heterogeneous environments. However, the performance of the learned predictor depends heavily on the availability and quality of provided environments. In this paper, we propose a simple and effective non-parametric method for generating heterogeneous environments via Random Sample Weighting (RSW). Given the training dataset from a single source environment, we randomly generate a set of covariate-determining sample weights and use each weighted training distribution to simulate an environment. We theoretically show that under appropriate conditions, such random sample weighting can produce sufficient heterogeneity to be exploited by common invariance constraints to find the invariant variables for stable prediction under covariate shifts. Extensive experiments on both simulated and real-world datasets clearly validate the effectiveness of our method.
APA, Harvard, Vancouver, ISO, and other styles
28

Hong, Zhiqing, Zelong Li, Shuxin Zhong, Wenjun Lyu, Haotian Wang, Yi Ding, Tian He, and Desheng Zhang. "CrossHAR: Generalizing Cross-dataset Human Activity Recognition via Hierarchical Self-Supervised Pretraining." Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8, no. 2 (May 13, 2024): 1–26. http://dx.doi.org/10.1145/3659597.

Full text
Abstract:
The increasing availability of low-cost wearable devices and smartphones has significantly advanced the field of sensor-based human activity recognition (HAR), attracting considerable research interest. One of the major challenges in HAR is the domain shift problem in cross-dataset activity recognition, which occurs due to variations in users, device types, and sensor placements between the source dataset and the target dataset. Although domain adaptation methods have shown promise, they typically require access to the target dataset during the training process, which might not be practical in some scenarios. To address these issues, we introduce CrossHAR, a new HAR model designed to improve model performance on unseen target datasets. CrossHAR involves three main steps: (i) CrossHAR explores the sensor data generation principle to diversify the data distribution and augment the raw sensor data. (ii) CrossHAR then employs a hierarchical self-supervised pretraining approach with the augmented data to develop a generalizable representation. (iii) Finally, CrossHAR fine-tunes the pretrained model with a small set of labeled data in the source dataset, enhancing its performance in cross-dataset HAR. Our extensive experiments across multiple real-world HAR datasets demonstrate that CrossHAR outperforms current state-of-the-art methods by 10.83% in accuracy, showing its effectiveness in generalizing to unseen target datasets.
APA, Harvard, Vancouver, ISO, and other styles
29

Wei, Weiwei, Yuxuan Liao, Yufei Wang, Shaoqi Wang, Wen Du, Hongmei Lu, Bo Kong, Huawu Yang, and Zhimin Zhang. "Deep Learning-Based Method for Compound Identification in NMR Spectra of Mixtures." Molecules 27, no. 12 (June 7, 2022): 3653. http://dx.doi.org/10.3390/molecules27123653.

Full text
Abstract:
Nuclear magnetic resonance (NMR) spectroscopy is highly unbiased and reproducible, which provides us with a powerful tool to analyze mixtures consisting of small molecules. However, compound identification in NMR spectra of mixtures is highly challenging because of chemical shift variations of the same compound in different mixtures and peak overlapping among molecules. Here, we present a pseudo-Siamese convolutional neural network method (pSCNN) to identify compounds in mixtures for NMR spectroscopy. A data augmentation method was implemented for the superposition of several NMR spectra sampled from a spectral database with random noise. The augmented dataset was split and used to train, validate and test the pSCNN model. Two experimental NMR datasets (flavor mixtures and an additional flavor mixture) were acquired to benchmark its performance in real applications. The results show that the proposed method can achieve good performances in the augmented test set (ACC = 99.80%, TPR = 99.70% and FPR = 0.10%), the flavor mixtures dataset (ACC = 97.62%, TPR = 96.44% and FPR = 2.29%) and the additional flavor mixture dataset (ACC = 91.67%, TPR = 100.00% and FPR = 10.53%). We have demonstrated that the translational invariance of convolutional neural networks can solve the chemical shift variation problem in NMR spectra. In summary, pSCNN is an off-the-shelf method to identify compounds in mixtures for NMR spectroscopy because of its accuracy in compound identification and robustness to chemical shift variation.
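The augmentation step described above — superposing spectra sampled from a library and adding random noise — can be sketched on toy data as follows. This is a generic illustration of the idea, not the pSCNN pipeline; the library and function names are made up.

```python
import numpy as np

def augment_mixture(library, rng, noise_std=0.01):
    """Superpose a random subset of library spectra and add Gaussian noise."""
    n = int(rng.integers(2, len(library) + 1))         # mixture of at least two compounds
    chosen = rng.choice(len(library), size=n, replace=False)
    mixture = library[chosen].sum(axis=0)              # spectra add approximately linearly
    noisy = mixture + rng.normal(0.0, noise_std, size=mixture.shape)
    return noisy, set(int(i) for i in chosen)          # spectrum + its multi-label targets

rng = np.random.default_rng(0)
library = np.eye(8)                                    # 8 toy "spectra", one peak each
spectrum, labels = augment_mixture(library, rng)
```

Repeating this draw yields an arbitrarily large labeled training set of synthetic mixtures from a modest pure-compound library.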
APA, Harvard, Vancouver, ISO, and other styles
30

Blanza, J., X. E. Cabasal, J. B. Cipriano, G. A. Guerrero, R. Y. Pescador, and E. V. Rivera. "Indoor Wireless Multipaths Outlier Detection and Clustering." Journal of Physics: Conference Series 2356, no. 1 (October 1, 2022): 012037. http://dx.doi.org/10.1088/1742-6596/2356/1/012037.

Full text
Abstract:
Wireless communication systems have grown and developed significantly in recent years to fulfill the growing demand for high data rates across a wireless medium. Channel models have been used to develop various sturdy wireless systems for indoor and outdoor applications, and these are simulated in the form of datasets. The presence of outliers in clusters has been a concern in datasets, as it affects the standard deviation and mean of the dataset, which reduces data accuracy. In this study, the outliers in the Cooperation in Science and Technology (COST) 2100 MIMO channel model dataset were shifted to the means of the clusters using the Mean Shift Outlier Detection method. Afterward, the data were clustered using simultaneous clustering and model selection matrix affinity (SCAMSMA). The Mean Shift Outlier Detection method identified 52 and 46 multipaths as outliers and improved the clustering accuracy of the indoor scenarios by 3.5% and 0.93%, respectively. It also increased the precision of the clustering, as shown by the decrease in the standard deviation of the Jaccard indices from 0.2435 to 0.1807 and from 0.3038 to 0.2075.
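Mean shift, which underlies the outlier-handling step above (and the clustering in entry 27), iteratively moves each point toward the kernel-weighted mean of its neighbours until it settles on a density mode. A minimal one-dimensional sketch of the iteration, unrelated to the COST 2100 data:

```python
import numpy as np

def mean_shift_1d(points, bandwidth=1.0, iters=50):
    """Move each point to the Gaussian-weighted mean of all points, repeatedly.

    Points converge to the local modes of the kernel density estimate;
    points sharing a mode belong to the same cluster.
    """
    modes = points.astype(float).copy()
    for _ in range(iters):
        dist = modes[:, None] - points[None, :]
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)   # Gaussian kernel weights
        modes = (w * points[None, :]).sum(axis=1) / w.sum(axis=1)
    return modes

data = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
modes = mean_shift_1d(data, bandwidth=0.5)
# The first three points converge to one mode, the last three to another.
```

Shifting an outlier to its cluster's mode, as in the study above, amounts to replacing the raw point with the mode it converges to.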
APA, Harvard, Vancouver, ISO, and other styles
31

Goel, Parth, and Amit Ganatra. "Unsupervised Domain Adaptation for Image Classification and Object Detection Using Guided Transfer Learning Approach and JS Divergence." Sensors 23, no. 9 (April 30, 2023): 4436. http://dx.doi.org/10.3390/s23094436.

Full text
Abstract:
Unsupervised domain adaptation (UDA) is a transfer learning technique utilized in deep learning. UDA aims to reduce the distribution gap between labeled source and unlabeled target domains by adapting a model through fine-tuning. Typically, UDA approaches assume the same categories in both domains. The effectiveness of transfer learning depends on the degree of similarity between the domains, which determines an efficient fine-tuning strategy. Furthermore, domain-specific tasks generally perform well when the feature distributions of the domains are similar. However, utilizing a trained source model directly in the target domain may not generalize effectively due to domain shift. Domain shift can be caused by intra-class variations, camera sensor variations, background variations, and geographical changes. To address these issues, we design an efficient unsupervised domain adaptation network for image classification and object detection that can learn transferable feature representations and reduce the domain shift problem in a unified network. We propose the guided transfer learning approach to select the layers for fine-tuning the model, which enhances feature transferability and utilizes the JS-Divergence to minimize the domain discrepancy between the domains. We evaluate our proposed approaches using multiple benchmark datasets. Our domain adaptive image classification approach achieves 93.2% accuracy on the Office-31 dataset and 75.3% accuracy on the Office-Home dataset. In addition, our domain adaptive object detection approach achieves 51.1% mAP on the Foggy Cityscapes dataset and 72.7% mAP on the Indian Vehicle dataset. We conduct extensive experiments and ablation studies to demonstrate the effectiveness and efficiency of our work. Experimental results also show that our work significantly outperforms the existing methods.
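The JS divergence used above to measure domain discrepancy is a symmetric, bounded variant of KL divergence. A small self-contained sketch of the quantity itself (not the authors' network), for discrete distributions in base 2, where it is bounded in [0, 1]:

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete distributions (base 2)."""
    p = np.asarray(p, float) + eps                 # eps avoids log(0)
    q = np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)                              # mixture midpoint
    kl = lambda a, b: np.sum(a * np.log2(a / b))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

identical = js_divergence([0.5, 0.5], [0.5, 0.5])  # 0: no domain gap
disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])   # near 1: maximal gap
```

In a UDA setting, minimizing such a divergence between source- and target-feature distributions pushes the two domains toward a shared representation.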
APA, Harvard, Vancouver, ISO, and other styles
32

Kushol, Rafsanjany, Alan H. Wilman, Sanjay Kalra, and Yee-Hong Yang. "DSMRI: Domain Shift Analyzer for Multi-Center MRI Datasets." Diagnostics 13, no. 18 (September 14, 2023): 2947. http://dx.doi.org/10.3390/diagnostics13182947.

Full text
Abstract:
In medical research and clinical applications, the utilization of MRI datasets from multiple centers has become increasingly prevalent. However, inherent variability between these centers presents challenges due to domain shift, which can impact the quality and reliability of the analysis. Regrettably, the absence of adequate tools for domain shift analysis hinders the development and validation of domain adaptation and harmonization techniques. To address this issue, this paper presents a novel Domain Shift analyzer for MRI (DSMRI) framework designed explicitly for domain shift analysis in multi-center MRI datasets. The proposed model assesses the degree of domain shift within an MRI dataset by leveraging various MRI-quality-related metrics derived from the spatial domain. DSMRI also incorporates features from the frequency domain to capture low- and high-frequency information about the image. It further includes the wavelet domain features by effectively measuring the sparsity and energy present in the wavelet coefficients. Furthermore, DSMRI introduces several texture features, thereby enhancing the robustness of the domain shift analysis process. The proposed framework includes visualization techniques such as t-SNE and UMAP to demonstrate that similar data are grouped closely while dissimilar data are in separate clusters. Additionally, quantitative analysis is used to measure the domain shift distance, domain classification accuracy, and the ranking of significant features. The effectiveness of the proposed approach is demonstrated using experimental evaluations on seven large-scale multi-site neuroimaging datasets.
APA, Harvard, Vancouver, ISO, and other styles
33

Sinha, Samarth, Homanga Bharadhwaj, Anirudh Goyal, Hugo Larochelle, Animesh Garg, and Florian Shkurti. "DIBS: Diversity Inducing Information Bottleneck in Model Ensembles." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 11 (May 18, 2021): 9666–74. http://dx.doi.org/10.1609/aaai.v35i11.17163.

Full text
Abstract:
Although deep learning models have achieved state-of-the-art performance on a number of vision tasks, generalization over high-dimensional multi-modal data and reliable predictive uncertainty estimation are still active areas of research. Bayesian approaches, including Bayesian Neural Nets (BNNs), do not scale well to modern computer vision tasks, as they are difficult to train and have poor generalization under dataset shift. This motivates the need for effective ensembles that can generalize and give reliable uncertainty estimates. In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity-inducing adversarial loss for learning the stochastic latent variables and thereby obtain the diversity in output predictions necessary for modeling multi-modal data. We evaluate our method on benchmark datasets: MNIST, CIFAR100, TinyImageNet, and MIT Places 2. Compared to the most competitive baselines, we show over 10% relative improvement in classification accuracy, over 5% relative improvement in generalization under dataset shift, and over 5% better predictive uncertainty estimation as inferred by efficient out-of-distribution (OOD) detection.
APA, Harvard, Vancouver, ISO, and other styles
34

Heffington, Colton, Brandon Beomseob Park, and Laron K. Williams. "The “Most Important Problem” Dataset (MIPD): a new dataset on American issue importance." Conflict Management and Peace Science 36, no. 3 (March 31, 2017): 312–35. http://dx.doi.org/10.1177/0738894217691463.

Full text
Abstract:
This article introduces the Most Important Problem Dataset (MIPD). The MIPD provides individual-level responses by Americans to “most important problem” questions from 1939 to 2015 coded into 58 different problem categories. The MIPD also contains individual-level information on demographics, economic evaluations, partisan preferences, approval and party competencies. This dataset can help answer questions about how the public prioritizes all problems, domestic and foreign, and we demonstrate how these data can shed light on how circumstances influence foreign policy attentiveness. Our exploratory analysis of foreign policy issue attention reveals some notable patterns about foreign policy public opinion. First, foreign policy issues rarely eclipse economic issues on the public’s problem agenda, so efforts to shift attention from poor economic performance to foreign policy via diversionary maneuvers are unlikely to be successful in the long term. Second, we find no evidence that partisan preferences—whether characterized as partisan identification or ideology—motivate partisans to prioritize different problems owing to perceptions of issue ownership. Instead, Republicans and Democrats, conservatives and liberals, respond in similar fashions to shifting domestic and international conditions.
APA, Harvard, Vancouver, ISO, and other styles
35

Guo, Fumin, Matthew Ng, Maged Goubran, Steffen E. Petersen, Stefan K. Piechnik, Stefan Neubauer, and Graham Wright. "Improving cardiac MRI convolutional neural network segmentation on small training datasets and dataset shift: A continuous kernel cut approach." Medical Image Analysis 61 (April 2020): 101636. http://dx.doi.org/10.1016/j.media.2020.101636.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Vescovi, R. F. C., M. B. Cardoso, and E. X. Miqueles. "Radiography registration for mosaic tomography." Journal of Synchrotron Radiation 24, no. 3 (April 7, 2017): 686–94. http://dx.doi.org/10.1107/s1600577517001953.

Full text
Abstract:
A hybrid method for stitching X-ray computed tomography (CT) datasets is proposed, and the feasibility of applying the scheme at a synchrotron tomography beamline with micrometre resolution is shown. The proposed method enables the field of view of the system to be extended while the spatial resolution and experimental setup remain unchanged. The approach relies on taking full tomographic datasets at different positions in a mosaic array and registering the frames using Fourier phase correlation and a residue-based correlation. To ensure correlation correctness, the limits for the shifts are determined from the experimental motor position readouts. The masked correlation image is then minimized to obtain the correct shift. The partial datasets are blended in sinogram space to be compatible with common CT reconstructors. The feasibility of using the algorithm to blend the partial datasets in projection space is also shown, creating a new single dataset, and standard reconstruction algorithms are used to restore high-resolution slices even with a small number of projections.
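Fourier phase correlation, the registration primitive named above, recovers the translation between two frames from the normalized cross-power spectrum. A minimal NumPy sketch of the standard technique (a generic illustration, not the beamline code):

```python
import numpy as np

def phase_correlation(a, b):
    """Estimate the integer translation of image a relative to image b."""
    Fa, Fb = np.fft.fft2(a), np.fft.fft2(b)
    cross = Fa * np.conj(Fb)
    cross /= np.abs(cross) + 1e-12        # keep only the phase difference
    corr = np.fft.ifft2(cross).real       # a delta peak at the shift
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap peaks beyond half the image size around to negative shifts.
    return tuple(p if p <= s // 2 else p - s for p, s in zip(peak, corr.shape))

img = np.zeros((64, 64))
img[20:30, 20:30] = 1.0                   # a toy "radiograph" feature
shifted = np.roll(np.roll(img, 5, axis=0), -3, axis=1)
dy, dx = phase_correlation(shifted, img)  # recovers (5, -3)
```

In a mosaic acquisition, the motor readouts bound the admissible `(dy, dx)` values, so spurious correlation peaks outside that window can be masked out, as the abstract describes.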
APA, Harvard, Vancouver, ISO, and other styles
37

Traynor, Carlos, Tarjinder Sahota, Helen Tomkinson, Ignacio Gonzalez-Garcia, Neil Evans, and Michael Chappell. "Imputing Biomarker Status from RWE Datasets—A Comparative Study." Journal of Personalized Medicine 11, no. 12 (December 13, 2021): 1356. http://dx.doi.org/10.3390/jpm11121356.

Full text
Abstract:
Missing data is a universal problem in analysing Real-World Evidence (RWE) datasets. In RWE datasets, there is a need to understand which features best correlate with clinical outcomes. In this context, the missing status of several biomarkers may appear as gaps in the dataset that hide meaningful values for analysis. Imputation methods are general strategies that replace missing values with plausible values. Using the Flatiron NSCLC dataset, including more than 35,000 subjects, we compare the imputation performance of six such methods on missing data: predictive mean matching, expectation-maximisation, factorial analysis, random forest, generative adversarial networks and multivariate imputations with tabular networks. We also conduct extensive synthetic data experiments with structural causal models. Statistical learning from incomplete datasets should select an appropriate imputation algorithm accounting for the nature of missingness, the impact of missing data, and the distribution shift induced by the imputation algorithm. For our synthetic data experiments, tabular networks had the best overall performance. Methods using neural networks are promising for complex datasets with non-linearities. However, conventional methods such as predictive mean matching work well for the Flatiron NSCLC biomarker dataset.
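Predictive mean matching, the conventional method the study found to work well on the biomarker dataset, imputes each missing value with a real observed value drawn from "donor" cases whose model predictions are closest. A toy single-covariate sketch (an illustration of PMM in general, not the study's pipeline):

```python
import numpy as np

def predictive_mean_match(x_obs, y_obs, x_mis, k=5, seed=0):
    """Impute missing y values by predictive mean matching (PMM)."""
    rng = np.random.default_rng(seed)
    beta = np.polyfit(x_obs, y_obs, 1)             # simple linear model on complete cases
    pred_obs = np.polyval(beta, x_obs)
    pred_mis = np.polyval(beta, x_mis)
    imputed = []
    for pm in pred_mis:
        # Donors: the k observed cases whose predicted means are closest.
        donors = np.argsort(np.abs(pred_obs - pm))[:k]
        imputed.append(y_obs[rng.choice(donors)])  # draw an actually observed value
    return np.array(imputed)

x_obs = np.arange(20.0)
y_obs = 2.0 * x_obs + 1.0
filled = predictive_mean_match(x_obs, y_obs, np.array([10.0]))
```

Because the imputed value is sampled from observed data rather than from the model, PMM preserves the marginal distribution of the variable, which limits the imputation-induced distribution shift the abstract warns about.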
APA, Harvard, Vancouver, ISO, and other styles
38

Wang, Xiaoyang, Chen Li, Jianqiao Zhao, and Dong Yu. "NaturalConv: A Chinese Dialogue Dataset Towards Multi-turn Topic-driven Conversation." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 16 (May 18, 2021): 14006–14. http://dx.doi.org/10.1609/aaai.v35i16.17649.

Full text
Abstract:
In this paper, we propose a Chinese multi-turn topic-driven conversation dataset, NaturalConv, which allows the participants to chat about anything they want as long as any element from the topic is mentioned and the topic shift is smooth. Our corpus contains 19.9K conversations from six domains, and 400K utterances with an average turn number of 20.1. These conversations contain in-depth discussions on related topics or widely natural transition between multiple topics. We believe either way is normal for human conversation. To facilitate the research on this corpus, we provide results of several benchmark models. Comparative results show that for this dataset, our current models are not able to provide significant improvement by introducing background knowledge/topic. Therefore, the proposed dataset should be a good benchmark for further research to evaluate the validity and naturalness of multi-turn conversation systems. Our dataset is available at https://ai.tencent.com/ailab/nlp/dialogue/#datasets.
APA, Harvard, Vancouver, ISO, and other styles
39

Huch, Sebastian, and Markus Lienkamp. "Towards Minimizing the LiDAR Sim-to-Real Domain Shift: Object-Level Local Domain Adaptation for 3D Point Clouds of Autonomous Vehicles." Sensors 23, no. 24 (December 18, 2023): 9913. http://dx.doi.org/10.3390/s23249913.

Full text
Abstract:
Perception algorithms for autonomous vehicles demand large, labeled datasets. Real-world data acquisition and annotation costs are high, making synthetic data from simulation a cost-effective option. However, training on one source domain and testing on a target domain can cause a domain shift attributed to local structure differences, resulting in a decrease in the model’s performance. We propose a novel domain adaptation approach to address this challenge and to minimize the domain shift between simulated and real-world LiDAR data. Our approach adapts 3D point clouds on the object level by learning the local characteristics of the target domain. A key feature involves downsampling to ensure domain invariance of the input data. The network comprises a state-of-the-art point completion network combined with a discriminator to guide training in an adversarial manner. We quantify the reduction in domain shift by training object detectors with the source, target, and adapted datasets. Our method successfully reduces the sim-to-real domain shift in a distribution-aligned dataset by almost 50%, from 8.63% to 4.36% 3D average precision. It is trained exclusively using target data, making it scalable and applicable to adapt point clouds from any source domain.
APA, Harvard, Vancouver, ISO, and other styles
40

Othman, Walaa, Alexey Kashevnik, Ammar Ali, and Nikolay Shilov. "DriverMVT: In-Cabin Dataset for Driver Monitoring including Video and Vehicle Telemetry Information." Data 7, no. 5 (May 11, 2022): 62. http://dx.doi.org/10.3390/data7050062.

Full text
Abstract:
Developing a driver monitoring system that can assess the driver's state is a prerequisite and a key to improving road safety. With the success of deep learning, such systems can achieve high accuracy if corresponding high-quality datasets are available. In this paper, we introduce DriverMVT (Driver Monitoring dataset with Videos and Telemetry). The dataset contains information about the driver's head pose, heart rate, and in-cabin behaviour such as drowsiness and an unfastened belt. This dataset can be used to train and evaluate deep learning models to estimate the driver's health state, mental state, concentration level, and his/her activity in the cabin. Developing such systems that can alert the driver in case of drowsiness or distraction can reduce the number of accidents and increase safety on the road. The dataset contains 1506 videos of 9 different drivers (7 males and 2 females), with a total of 5119k frames and over 36 h of footage. In addition, we evaluated the dataset with the multi-task temporal shift convolutional attention network (MTTS-CAN) algorithm. The algorithm's mean average error on our dataset is 16.375 heartbeats per minute.
APA, Harvard, Vancouver, ISO, and other styles
41

Ishihara, Kazuaki, and Koutarou Matsumoto. "Comparing the Robustness of ResNet, Swin-Transformer, and MLP-Mixer under Unique Distribution Shifts in Fundus Images." Bioengineering 10, no. 12 (December 1, 2023): 1383. http://dx.doi.org/10.3390/bioengineering10121383.

Full text
Abstract:
Background: Diabetic retinopathy (DR) is the leading cause of visual impairment and blindness. Consequently, numerous deep learning models have been developed for the early detection of DR. Safety-critical applications employed in medical diagnosis must be robust to distribution shifts. Previous studies have focused on model performance under distribution shifts using natural image datasets such as ImageNet, CIFAR-10, and SVHN. However, there is a lack of research specifically investigating the performance using medical image datasets. To address this gap, we investigated trends under distribution shifts using fundus image datasets. Methods: We used the EyePACS dataset for DR diagnosis, introduced noise specific to fundus images, and evaluated the performance of ResNet, Swin-Transformer, and MLP-Mixer models under a distribution shift. The discriminative ability was evaluated using the Area Under the Receiver Operating Characteristic curve (ROC-AUC), while the calibration ability was evaluated using the monotonic sweep calibration error (ECE sweep). Results: Swin-Transformer exhibited a higher ROC-AUC than ResNet under all types of noise and displayed a smaller reduction in the ROC-AUC due to noise. ECE sweep did not show a consistent trend across different model architectures. Conclusions: Swin-Transformer consistently demonstrated superior discrimination compared to ResNet. This trend persisted even under unique distribution shifts in the fundus images.
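The ROC-AUC used above has a simple rank interpretation: the probability that a randomly chosen positive is scored above a randomly chosen negative. A self-contained sketch showing how discrimination degrades when noise is injected, here crudely simulated as noise on the model scores rather than on fundus images:

```python
import numpy as np

def roc_auc(y_true, scores):
    """Rank-based ROC-AUC: P(random positive scored above random negative)."""
    y_true = np.asarray(y_true)
    pos, neg = scores[y_true == 1], scores[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()     # ties count half
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

rng = np.random.default_rng(0)
y = np.array([0] * 100 + [1] * 100)
clean = np.concatenate([rng.normal(0, 1, 100), rng.normal(2, 1, 100)])
noisy = clean + rng.normal(0, 2, 200)               # a distribution-shift stand-in
auc_clean, auc_noisy = roc_auc(y, clean), roc_auc(y, noisy)
```

The gap `auc_clean - auc_noisy` plays the role of the "reduction in ROC-AUC due to noise" that the study compares across architectures.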
APA, Harvard, Vancouver, ISO, and other styles
42

Takahashi, Satoshi, Masamichi Takahashi, Manabu Kinoshita, Mototaka Miyake, Jun Sese, Kazuma Kobayashi, Koichi Ichimura, Yoshitaka Narita, Ryuji Hamamoto, and Consortium of Molecular Diagnosis of glioma. "RBIO-03. INITIAL RESULT OF DEVELOP ROBUST DEEP LEARNING MODEL FOR DETECTING GENOMIC STATUS IN GLIOMAS AGAINST IMAGE DIFFERENCES AMONG FACILITIES." Neuro-Oncology 23, Supplement_6 (November 2, 2021): vi192. http://dx.doi.org/10.1093/neuonc/noab196.760.

Full text
Abstract:
BACKGROUND The importance of detecting the genomic status of gliomas is increasingly recognized, and IDH (isocitrate dehydrogenase) mutation and TERT (telomerase reverse transcriptase) promoter mutation have a significant impact on treatment decisions. Noninvasive prediction of these genomic statuses in gliomas is a challenging problem; however, a deep learning model using magnetic resonance imaging (MRI) can be a solution. Image differences among facilities causing performance degradation, called domain shift, have also been reported in other tasks such as brain tumor segmentation. We investigated whether a deep learning model could predict the gene status and, if so, to what extent it would be affected by domain shift. METHOD We used the Multimodal Brain Tumor Segmentation Challenge (BraTS) data and the Japanese cohort (JC) dataset, which consists of brain tumor images collected from 544 patients in 10 facilities in Japan. We focused on IDH mutation and TERT promoter mutation. The deep learning models to predict the status of these genes were trained on the BraTS dataset or the training portion of the JC dataset, and the test portion of the JC dataset was used to evaluate the accuracy of the models. RESULTS The IDH mutation prediction model trained on the BraTS dataset showed 80.0% accuracy on the validation portion of the BraTS dataset, but only 67.3% on the test portion of the JC dataset. The TERT promoter mutation prediction model trained on the training portion of the JC dataset showed only 49% accuracy on the test portion of the JC dataset. CONCLUSION IDH mutation can be predicted by deep learning models using MRI, but the performance degradation caused by domain shift was significant. On the other hand, TERT promoter mutation could not be predicted accurately enough by current deep learning techniques. For both mutations, further studies are needed.
APA, Harvard, Vancouver, ISO, and other styles
43

Allen, Robert C., Mattia C. Bertazzini, and Leander Heldring. "The Economic Origins of Government." American Economic Review 113, no. 10 (October 1, 2023): 2507–45. http://dx.doi.org/10.1257/aer.20201919.

Full text
Abstract:
We test between cooperative and extractive theories of the origins of government. We use river shifts in southern Iraq as a natural experiment, in a new archeological panel dataset. A shift away creates a local demand for a government to coordinate because private river irrigation needs to be replaced with public canals. It disincentivizes local extraction as land is no longer productive without irrigation. Consistent with a cooperative theory of government, a river shift away led to state formation, canal construction, and the payment of tribute. We argue that the first governments coordinated between extended households which implemented public good provision. (JEL D72, H11, H41, N45, N55, Q15)
APA, Harvard, Vancouver, ISO, and other styles
44

Wu, Teng, Bruno Vallet, Marc Pierrot-Deseilligny, and Ewelina Rupnik. "An evaluation of Deep Learning based stereo dense matching dataset shift from aerial images and a large scale stereo dataset." International Journal of Applied Earth Observation and Geoinformation 128 (April 2024): 103715. http://dx.doi.org/10.1016/j.jag.2024.103715.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Asopa, U., S. Kumar, and P. K. Thakur. "PSInSAR Study of Lyngenfjord Norway, using TerraSAR-X Data." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences IV-5 (November 15, 2018): 245–51. http://dx.doi.org/10.5194/isprs-annals-iv-5-245-2018.

Full text
Abstract:
In this research paper, focus is given to exploring the potential of the Persistent Scatterer Interferometric Synthetic Aperture Radar (PSInSAR) technique for the measurement of landslides, an extension of the existing DInSAR technique. In the PSInSAR technique, movement is measured by finding the phase shift in the scatterers present in the study area over the course of time; the backscattering of such a scatterer does not change during the study. Using this technique, 32 datasets acquired over the period 2009 to 2011 over the area of Troms County at Lyngen Fjord, Norway, are analysed. The datasets were acquired with the TerraSAR-X and TanDEM-X pair in Stripmap acquisition mode. Coregistration of the datasets with the master image is performed with subpixel accuracy to align all the datasets correctly. APS estimation is done in order to remove the phase decorrelation caused by the atmosphere, movement, etc., using phase-unwrapping algorithms that allow the processing of sparse data; the effect of the atmosphere is further reduced by temporal analysis of the phase shift in interferograms of successive datasets. This study shows that the shift can be estimated through temporal analysis of the data acquired by TerraSAR-X. The velocity output is displayed in a map reflecting the velocity of movement. Apart from this, data properties such as the temporal and spatial baseline distributions are displayed in a chart. Other outputs obtained, such as the atmospheric phase screen, the sparse point distribution, and the reflectivity map of the study area, are displayed on a terrain map. The output velocity of the terrain movement is found to be in the range of −40 mm/yr to −70 mm/yr.
APA, Harvard, Vancouver, ISO, and other styles
46

Tang, Yansong, Xingyu Liu, Xumin Yu, Danyang Zhang, Jiwen Lu, and Jie Zhou. "Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition." ACM Transactions on Multimedia Computing, Communications, and Applications 18, no. 2 (May 31, 2022): 1–24. http://dx.doi.org/10.1145/3472722.

Full text
Abstract:
Rapid progress and superior performance have been achieved for skeleton-based action recognition recently. In this article, we investigate this problem under a cross-dataset setting, which is a new, pragmatic, and challenging task in real-world scenarios. Following the unsupervised domain adaptation (UDA) paradigm, the action labels are only available on a source dataset, but unavailable on a target dataset in the training stage. Different from the conventional adversarial learning-based approaches for UDA, we utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets. Our inspiration is drawn from Cubism, an art genre from the early 20th century, which breaks and reassembles the objects to convey a greater context. By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks to explore the temporal and spatial dependency of a skeleton-based action and improve the generalization ability of the model. We conduct experiments on six datasets for skeleton-based action recognition, including three large-scale datasets (NTU RGB+D, PKU-MMD, and Kinetics) where new cross-dataset settings and benchmarks are established. Extensive results demonstrate that our method outperforms state-of-the-art approaches. The source codes of our model and all the compared methods are available at https://github.com/shanice-l/st-cubism.
APA, Harvard, Vancouver, ISO, and other styles
47

Guentchev, Galina, Joseph J. Barsugli, and Jon Eischeid. "Homogeneity of Gridded Precipitation Datasets for the Colorado River Basin." Journal of Applied Meteorology and Climatology 49, no. 12 (December 1, 2010): 2404–15. http://dx.doi.org/10.1175/2010jamc2484.1.

Full text
Abstract:
Inhomogeneity in gridded meteorological data may arise from the inclusion of inhomogeneous station data or from aspects of the gridding procedure itself. However, the homogeneity of gridded datasets is rarely questioned, even though an analysis of trends or variability that uses inhomogeneous data could be misleading or even erroneous. Three gridded precipitation datasets that have been used in studies of the Upper Colorado River basin were tested for homogeneity in this study: that of Maurer et al., that of Beyene and Lettenmaier, and the Parameter–Elevation Regressions on Independent Slopes Model (PRISM) dataset of Daly et al. Four absolute homogeneity tests were applied to annual precipitation amounts on a grid cell and on a hydrologic subregion spatial scale for the periods 1950–99 and 1916–2006. The analysis detects breakpoints in 1977 and 1978 at many locations in all three datasets that may be due to an anomalously rapid shift in the Pacific decadal oscillation. One dataset showed breakpoints in the 1940s that might be due to the widespread change in the number of available observing stations used as input for that dataset. The results also indicated that the time series from the three datasets are sufficiently homogeneous for variability analysis during the 1950–99 period when aggregated on a subregional scale.
48

Sime, Louise C., Richard C. A. Hindmarsh, and Hugh Corr. "Automated processing to derive dip angles of englacial radar reflectors in ice sheets." Journal of Glaciology 57, no. 202 (2011): 260–66. http://dx.doi.org/10.3189/002214311796405870.

Full text
Abstract:
We present a novel automated processing method for obtaining layer dip from radio-echo sounding (RES) data. The method is robust, easily applicable and can be used to process large (several terabytes) ground and airborne RES datasets using modest computing resources. We give test results from the application of the method to two Antarctic datasets: the Fletcher Promontory ground-based radar dataset and the Wilkes Subglacial Basin airborne radar dataset. The automated RES processing (ARESP) method comprises the basic steps: (1) RES noise reduction; (2) radar layer identification; (3) isolation of individual ‘layer objects’; (4) measurement of orientation and other object properties; (5) elimination of noise in the orientation data; and (6) collation of the valid dip information. The apparent dip datasets produced by the method will aid glaciologists seeking to understand ice-flow dynamics in Greenland and Antarctica: ARESP could enable a shift from selective regional case studies to ice-sheet-scale studies.
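Step (4) of the ARESP pipeline measures the orientation of each isolated layer object. A generic way to do this (not necessarily the authors' implementation) is to take the principal axis of the object's pixel coordinates in the radargram; the function name below is hypothetical.

```python
import numpy as np

def layer_dip_degrees(mask):
    """Apparent dip of a binary 'layer object' in a radargram:
    the angle of the principal axis of its pixel coordinates,
    in degrees from horizontal, folded into [0, 180)."""
    rows, cols = np.nonzero(mask)
    pts = np.stack([cols, rows]).astype(float)   # x = along-track, y = depth
    pts -= pts.mean(axis=1, keepdims=True)       # centre the point cloud
    cov = pts @ pts.T / pts.shape[1]             # 2x2 coordinate covariance
    evals, evecs = np.linalg.eigh(cov)           # eigenvalues ascending
    vx, vy = evecs[:, -1]                        # eigenvector of largest eigenvalue
    return np.degrees(np.arctan2(vy, vx)) % 180.0
```

A flat internal layer yields a dip near 0°, and the step (5) noise elimination would then discard objects whose orientation is inconsistent with their neighbours.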
49

Sharif, Muhammad Imran, Muhammad Attique Khan, Abdullah Alqahtani, Muhammad Nazir, Shtwai Alsubai, Adel Binbusayyis, and Robertas Damaševičius. "Deep Learning and Kurtosis-Controlled, Entropy-Based Framework for Human Gait Recognition Using Video Sequences." Electronics 11, no. 3 (January 21, 2022): 334. http://dx.doi.org/10.3390/electronics11030334.

Full text
Abstract:
Gait is commonly defined as the movement pattern of the limbs over a hard substrate, and it serves as a source of identification information for various computer-vision and image-understanding techniques. A variety of parameters, such as human clothing, angle shift, walking style, occlusion, and so on, have a significant impact on gait-recognition systems, making the scene quite complex to handle. In this article, we propose a system that effectively handles problems associated with viewing angle shifts and walking styles in a real-time environment. The following steps are included in the proposed novel framework: (a) real-time video capture, (b) feature extraction using transfer learning on the ResNet101 deep model, and (c) feature selection using the proposed kurtosis-controlled entropy (KcE) approach, followed by a correlation-based feature fusion step. The most discriminant features are then classified using the most advanced machine learning classifiers. The simulation process is fed by the CASIA B dataset as well as a real-time captured dataset. On selected datasets, the accuracy is 95.26% and 96.60%, respectively. When compared to several known techniques, the results show that our proposed framework outperforms them all.
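The paper's kurtosis-controlled entropy (KcE) selection is specific to its framework and not fully specified in the abstract; as a loose illustration of the kurtosis-screening half of the idea only, one can rank deep-feature columns by their sample kurtosis and keep the heavy-tailed (most discriminant) ones. All names and the threshold are assumptions of this sketch, not the authors' method.

```python
import numpy as np

def kurtosis(a, axis=0):
    """Fisher kurtosis (a normal distribution scores 0)."""
    a = np.asarray(a, dtype=float)
    m = a.mean(axis=axis)
    s = a.std(axis=axis)
    return ((a - m) ** 4).mean(axis=axis) / s ** 4 - 3.0

def select_features(features, threshold=0.0):
    """Keep feature columns whose kurtosis exceeds the threshold --
    a stand-in for the paper's KcE criterion, for illustration only."""
    k = kurtosis(features, axis=0)
    return np.nonzero(k > threshold)[0]
```

In the paper the retained features are additionally fused by correlation before classification; that step is omitted here.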
50

Hidalgo Davila, Mateo, Maria Baldeon-Calisto, Juan Jose Murillo, Bernardo Puente-Mejia, Danny Navarrete, Daniel Riofrío, Noel Peréz, Diego S. Benítez, and Ricardo Flores Moyano. "Analyzing the Effect of Basic Data Augmentation for COVID-19 Detection through a Fractional Factorial Experimental Design." Emerging Science Journal 7 (September 24, 2022): 1–16. http://dx.doi.org/10.28991/esj-2023-sper-01.

Full text
Abstract:
The COVID-19 pandemic has created a worldwide healthcare crisis. Convolutional Neural Networks (CNNs) have recently been used with encouraging results to help detect COVID-19 from chest X-ray images. However, to generalize well to unseen data, CNNs require large labeled datasets. Due to the lack of publicly available COVID-19 datasets, most CNNs apply various data augmentation techniques during training. However, there has not been a thorough statistical analysis of how data augmentation operations affect classification performance for COVID-19 detection. In this study, a fractional factorial experimental design is used to examine the impact of basic augmentation methods on COVID-19 detection. This design identifies which particular data augmentation techniques and interactions have a statistically significant impact, positive or negative, on classification performance. Using the CoroNet architecture and two publicly available COVID-19 datasets, the most common basic augmentation methods in the literature are evaluated. The results of the experiments demonstrate that zoom range and height shift positively impact the model's accuracy on dataset 1. The performance on dataset 2 is unaffected by any of the data augmentation operations. Additionally, a new state-of-the-art performance is achieved on both datasets by training CoroNet with the ideal data augmentation values found using the experimental design. Specifically, in dataset 1, 97% accuracy, 93% precision, and 97.7% recall were attained, while in dataset 2, 97% accuracy, 97% precision, and 97.6% recall were achieved. These results indicate that analyzing the effects of data augmentations on a particular task and dataset is essential for the best performance.
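The two augmentation factors the study found beneficial on dataset 1 can be sketched as plain array operations; this is a minimal NumPy illustration of what "height shift" and "zoom" do to an image, not the study's augmentation pipeline, and the function names are assumptions.

```python
import numpy as np

def height_shift(img, shift):
    """Shift an image down by `shift` rows (up if negative),
    zero-padding the vacated rows."""
    out = np.zeros_like(img)
    if shift >= 0:
        out[shift:] = img[:img.shape[0] - shift]
    else:
        out[:shift] = img[-shift:]
    return out

def zoom(img, factor):
    """Nearest-neighbour centre zoom: crop the central 1/factor
    region and resample it back to the original size."""
    h, w = img.shape[:2]
    ch, cw = int(h / factor), int(w / factor)
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = img[top:top + ch, left:left + cw]
    rows = np.arange(h) * ch // h
    cols = np.arange(w) * cw // w
    return crop[rows][:, cols]
```

In a fractional factorial design, each such operation is a factor whose levels (e.g. shift amount, zoom factor) are varied across training runs so that main effects and interactions can be estimated from far fewer runs than a full factorial would need.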
