Journal articles on the topic 'Mixed data types'

Consult the top 50 journal articles for your research on the topic 'Mixed data types.'

1

Wang, Lu, Dongxiao Zhu, and Ming Dong. "Clustering over-dispersed data with mixed feature types." Statistical Analysis and Data Mining: The ASA Data Science Journal 11, no. 2 (January 10, 2018): 55–65. http://dx.doi.org/10.1002/sam.11369.

2

Sánchez-Borrego, I., J. D. Opsomer, M. Rueda, and A. Arcos. "Nonparametric estimation with mixed data types in survey sampling." Revista Matemática Complutense 27, no. 2 (December 5, 2013): 685–700. http://dx.doi.org/10.1007/s13163-013-0142-2.

3

Jørgensen, Bent, Søren Lundbye-Christensen, Peter Xue-Kun Song, and Li Sun. "State-space models for multivariate longitudinal data of mixed types." Canadian Journal of Statistics 24, no. 3 (September 1996): 385–402. http://dx.doi.org/10.2307/3315747.

4

Einarsson, Bo. "Mixed language programming realization and the provision of data types." ACM SIGNUM Newsletter 21, no. 1-2 (April 1986): 2–9. http://dx.doi.org/10.1145/15983.15984.

5

Yoon, Grace, Raymond J. Carroll, and Irina Gaynanova. "Sparse semiparametric canonical correlation analysis for data of mixed types." Biometrika 107, no. 3 (April 15, 2020): 609–25. http://dx.doi.org/10.1093/biomet/asaa007.

Abstract:
Canonical correlation analysis investigates linear relationships between two sets of variables, but it often works poorly on modern datasets because of high dimensionality and mixed data types such as continuous, binary and zero-inflated. To overcome these challenges, we propose a semiparametric approach to sparse canonical correlation analysis based on the Gaussian copula. The main result of this paper is a truncated latent Gaussian copula model for data with excess zeros, which allows us to derive a rank-based estimator of the latent correlation matrix for mixed variable types without estimation of marginal transformation functions. The resulting canonical correlation analysis method works well in high-dimensional settings, as demonstrated via numerical studies, and when applied to the analysis of association between gene expression and microRNA data from breast cancer patients.
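To make the idea of a rank-based latent correlation estimate concrete, here is a minimal sketch for the simplest case only: two continuous variables under a Gaussian copula, where Kendall's tau maps to the latent correlation through sin(pi/2 * tau). This is not the truncated (zero-inflated) estimator the abstract describes; the function names are hypothetical and chosen for illustration.

```python
import math
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            score += 1      # concordant pair
        elif product < 0:
            score -= 1      # discordant pair
    return score / (n * (n - 1) / 2)


def latent_correlation_continuous(x, y):
    """Latent Gaussian-copula correlation for a continuous/continuous pair.

    Assumption: only the continuous case is handled; binary and
    truncated variables need different bridge functions.
    """
    return math.sin(math.pi / 2 * kendall_tau_a(x, y))


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
r_raw = latent_correlation_continuous(x, y)

# Monotone transformations of the margins leave the ranks, and so the estimate, unchanged.
r_transformed = latent_correlation_continuous([math.exp(v) for v in x], [v ** 3 for v in y])
print(r_raw, r_transformed)
```

Because the estimate depends only on ranks, no marginal transformation has to be estimated, which is the property the abstract highlights.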
6

Amatya, Anup, and Hakan Demirtas. "Concurrent generation of multivariate mixed data with variables of dissimilar types." Journal of Statistical Computation and Simulation 86, no. 18 (April 22, 2016): 3595–607. http://dx.doi.org/10.1080/00949655.2016.1177530.

7

Sun, Jinhui, Pang Du, Hongyu Miao, and Hua Liang. "Robust feature screening procedures for single and mixed types of data." Journal of Statistical Computation and Simulation 90, no. 7 (February 4, 2020): 1173–93. http://dx.doi.org/10.1080/00949655.2020.1719104.

8

Vogl, Susanne. "Integrating and Consolidating Data in Mixed Methods Data Analysis: Examples From Focus Group Data With Children." Journal of Mixed Methods Research 13, no. 4 (August 31, 2018): 536–54. http://dx.doi.org/10.1177/1558689818796364.

Abstract:
The challenge in data analysis often lies in accounting for the multidimensionality and complexity of the data while simultaneously discovering patterns. Integrating and consolidating different types of data during analysis can broaden the perspective and permit obtaining complementary views. This methodological research study on data collection illustrates how one type of data collection generates different types of data, which can be linked and consolidated to reach a better understanding of the topic. Procedures and practicalities are illustrated to offer a good practice example for data integration and consolidation. With the methodological reflection of research practice, I evaluate the consequences for the field of mixed methods research, in which the practicalities of an integrated mixed analysis still need to be elaborated.
9

Harch, B. D., K. E. Basford, I. H. DeLacy, P. K. Lawrence, and A. Cruickshank. "Mixed data types and the use of pattern analysis on the Australian groundnut germplasm data." Genetic Resources and Crop Evolution 43, no. 4 (August 1996): 363–76. http://dx.doi.org/10.1007/bf00132957.

10

Huang, Mingze, Christian Müller, and Irina Gaynanova. "latentcor: An R Package for estimating latent correlations from mixed data types." Journal of Open Source Software 6, no. 65 (September 21, 2021): 3634. http://dx.doi.org/10.21105/joss.03634.

11

Brus, D. J., P. A. Slim, G. Gort, A. H. Heidema, and H. van Dobben. "Monitoring habitat types by the mixed multinomial logit model using panel data." Ecological Indicators 67 (August 2016): 108–16. http://dx.doi.org/10.1016/j.ecolind.2016.02.043.

12

Zhao, Jing Hua, and Jian'an Luan. "Mixed Modeling with Whole Genome Data." Journal of Probability and Statistics 2012 (2012): 1–16. http://dx.doi.org/10.1155/2012/485174.

Abstract:
Objective. We consider the need for a modeling framework for related individuals and various sources of variations. The relationships could either be among relatives in families or among unrelated individuals in a general population with cryptic relatedness; both could be refined or derived with whole genome data. As for variations, they can include oligogenes, polygenes, single nucleotide polymorphisms (SNPs), and covariates. Methods. We describe mixed models as a coherent theoretical framework to accommodate correlations for various types of outcomes in relation to many sources of variations. The framework also extends to consortium meta-analysis involving both population-based and family-based studies. Results. Through examples we show that the framework can be furnished with general statistical packages whose great advantage lies in simplicity and flexibility to study both genetic and environmental effects. Areas which require further work are also indicated. Conclusion. Mixed models will play an important role in practical analysis of data on both families and unrelated individuals when whole genome information is available.
13

Hasanpour, Hesam, Ramak Ghavamizadeh Meibodi, Keivan Navi, and Sareh Asadi. "Dealing with mixed data types in the obsessive-compulsive disorder using ensemble classification." Neurology, Psychiatry and Brain Research 32 (June 2019): 77–84. http://dx.doi.org/10.1016/j.npbr.2019.04.004.

14

Tarsitano, Agostino, and Marianna Falcone. "Missing-Values Adjustment for Mixed-Type Data." Journal of Probability and Statistics 2011 (2011): 1–20. http://dx.doi.org/10.1155/2011/290380.

Abstract:
We propose a new method of single imputation, reconstruction, and estimation of nonreported, incorrect, implausible, or excluded values in more than one field of the record. In particular, we will be concerned with data sets involving a mixture of numeric, ordinal, binary, and categorical variables. Our technique is a variation of the popular nearest neighbor hot deck imputation (NNHDI) where “nearest” is defined in terms of a global distance obtained as a convex combination of the distance matrices computed for the various types of variables. We address the problem of proper weighting of the partial distance matrices in order to reflect their significance, reliability, and statistical adequacy. Performance of several weighting schemes is compared under a variety of settings in coordination with imputation of the least power mean of the Box-Cox transformation applied to the values of the donors. Through analysis of simulated and actual data sets, we will show that this approach is appropriate. Our main contribution has been to demonstrate that mixed data may optimally be combined to allow the accurate reconstruction of missing values in the target variable even when some data are absent from the other fields of the record.
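The core mechanism in this abstract, a nearest-neighbor donor chosen by a global distance that is a convex combination of per-type distances, can be sketched roughly as follows. This is an illustrative simplification, not the authors' procedure: it uses only two variable types, a range-scaled absolute difference for numeric fields, simple mismatch for categorical fields, and a single hypothetical weight `weight_numeric`; the paper's weighting schemes and Box-Cox step are not reproduced.

```python
def impute_nearest_donor(recipient, donors, target,
                         numeric_fields, categorical_fields,
                         weight_numeric=0.5):
    """Return the target value of the donor closest to the recipient.

    The global distance is weight_numeric * d_numeric
    + (1 - weight_numeric) * d_categorical, a convex combination.
    All names here are hypothetical and for illustration only.
    """
    # Range of each numeric field over the donors, used for scaling.
    ranges = {
        f: max(d[f] for d in donors) - min(d[f] for d in donors)
        for f in numeric_fields
    }

    def distance(donor):
        d_num = sum(
            abs(recipient[f] - donor[f]) / ranges[f] if ranges[f] else 0.0
            for f in numeric_fields
        ) / len(numeric_fields)
        d_cat = sum(
            recipient[f] != donor[f] for f in categorical_fields
        ) / len(categorical_fields)
        return weight_numeric * d_num + (1 - weight_numeric) * d_cat

    return min(donors, key=distance)[target]


donors = [
    {"age": 30, "sex": "F", "income": 40},
    {"age": 50, "sex": "M", "income": 70},
    {"age": 33, "sex": "M", "income": 45},
]
recipient = {"age": 32, "sex": "F", "income": None}

balanced = impute_nearest_donor(recipient, donors, "income", ["age"], ["sex"])
numeric_only = impute_nearest_donor(recipient, donors, "income", ["age"], ["sex"],
                                    weight_numeric=1.0)
print(balanced, numeric_only)
```

In this toy data the chosen donor changes with the weight, which illustrates why the abstract treats the weighting of the partial distance matrices as a problem in its own right.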
15

Coombes, Caitlin E., Zachary B. Abrams, Samantha Nakayiza, Guy Brock, and Kevin R. Coombes. "Umpire 2.0: Simulating realistic, mixed-type, clinical data for machine learning." F1000Research 9 (March 5, 2021): 1186. http://dx.doi.org/10.12688/f1000research.25877.2.

Abstract:
The Umpire 2.0 R-package offers a streamlined, user-friendly workflow to simulate complex, heterogeneous, mixed-type data with known subgroup identities, dichotomous outcomes, and time-to-event data, while providing ample opportunities for fine-tuning and flexibility. Here, we describe how we have expanded the core Umpire 1.0 R-package, developed to simulate gene expression data, to generate clinically realistic, mixed-type data for use in evaluating unsupervised and supervised machine learning (ML) methods. As the availability of large-scale clinical data for ML has increased, clinical data has posed unique challenges, including widely variable size, individual biological heterogeneity, data collection and measurement noise, and mixed data types. Developing and validating ML methods for clinical data requires data sets with known ground truth, generated from simulation. Umpire 2.0 addresses challenges to simulating realistic clinical data by providing the user a series of modules to generate survival parameters and subgroups, apply meaningful additive noise, and discretize to single or mixed data types. Umpire 2.0 provides broad functionality across sample sizes, feature spaces, and data types, allowing the user to simulate correlated, heterogeneous, binary, continuous, categorical, or mixed type data from the scale of a small clinical trial to data on thousands of patients drawn from electronic health records. The user may generate elaborate simulations by varying parameters in order to compare algorithms or interrogate operating characteristics of an algorithm in both supervised and unsupervised ML.
16

Coombes, Caitlin E., Zachary B. Abrams, Samantha Nakayiza, Guy Brock, and Kevin R. Coombes. "Umpire 2.0: Simulating realistic, mixed-type, clinical data for machine learning." F1000Research 9 (October 1, 2020): 1186. http://dx.doi.org/10.12688/f1000research.25877.1.

17

Wang, Yurong, and Yannan Luo. "Research of Wind Power Correlation With Three Different Data Types Based on Mixed Copula." IEEE Access 6 (2018): 77986–95. http://dx.doi.org/10.1109/access.2018.2884539.

18

Storlie, C. B., S. M. Myers, S. K. Katusic, A. L. Weaver, R. G. Voigt, P. E. Croarkin, R. E. Stoeckel, and J. D. Port. "Clustering and variable selection in the presence of mixed variable types and missing data." Statistics in Medicine 37, no. 19 (May 17, 2018): 2884–99. http://dx.doi.org/10.1002/sim.7697.

19

Zhang, Weiping, MengMeng Zhang, and Yu Chen. "A Copula-Based GLMM Model for Multivariate Longitudinal Data with Mixed-Types of Responses." Sankhya B 82, no. 2 (June 11, 2019): 353–79. http://dx.doi.org/10.1007/s13571-019-00197-8.

20

Mami, Ahmed M., and Ayman Ali Elberjo. "ON USING NONPARAMETRIC REGRESSION METHODS TO ESTIMATE CATEGORICAL OUTCOMES MODELS WITH MIXED DATA TYPES." EPH - International Journal of Applied Science 1, no. 3 (September 27, 2015): 15–22. http://dx.doi.org/10.53555/eijas.v1i3.15.

Abstract:
Many data analysis methods are sensitive to the type of data under study. When beginning any statistical data analysis, it is very important to recognize the different types of data. Data can take a variety of values or belong to various categories, whether numerical or nominal. Broadly, there are two types of data: quantitative and qualitative (categorical). General and powerful methodological approaches for the analysis of quantitative data have been widely taught for several decades, while methods for qualitative data analysis have blossomed only in the past 25 years. The need for categorical data analysis techniques has increased steadily in recent years in economics, health, and social science. However, the analysis of categorical data models is complex when the dependent variable is a binary or multinomial outcome and the explanatory variables are mixed. The main goal of this paper is to estimate a nonparametric regression model for binary and multinomial outcomes with mixed explanatory variables, based on the nonparametric conditional CDF method and the PDF method of bandwidth selection presented by Li and Racine (2008). We then compare it with one of the most common methods of parametric regression (the logistic regression model). The comparisons are based on two criteria: classification ability, measured by the Correct Classification Ratio (CCR), and the log-likelihood value (LLK). We conducted several simulation studies using randomly generated data (categorical, discrete, and continuous) in order to investigate the performance of both the parametric model and the nonparametric model for binary and multinomial outcomes. Interesting results have been achieved in this work. The methods have also been applied to real data containing mixed variables, using the dataset of the Household Expenditure Survey (HES).
21

Grando, Adela, Julia Ivanova, Megan Hiestand, Hiral Soni, Anita Murcko, Michael Saks, David Kaufman, et al. "Mental health professional perspectives on health data sharing: Mixed methods study." Health Informatics Journal 26, no. 3 (January 11, 2020): 2067–82. http://dx.doi.org/10.1177/1460458219893848.

Abstract:
This study explores behavioral health professionals’ perceptions of granular data. Semi-structured in-person interviews of 20 health professionals were conducted at two different sites. Qualitative and quantitative analysis was performed. While most health professionals agreed that patients should control who accesses their personal medical record (70%), there are certain types of health information that should never be restricted (65%). Emergent themes, including perceived reasons that patients might share or withhold certain types of health information (65%), care coordination (12%), patient comprehension (11%), stigma (5%), trust (3%), sociocultural understanding (3%), and dissatisfaction with consent processes (1%), are explored. The impact of care role (prescriber or non-prescriber) on data-sharing perception is explored as well. This study informs the discussion on developing technology that helps balance provider and patient data-sharing and access needs.
22

Liu, Zhenyu, Tao Wen, Wei Sun, and Qilong Zhang. "A Novel Multiway Splits Decision Tree for Multiple Types of Data." Mathematical Problems in Engineering 2020 (November 12, 2020): 1–12. http://dx.doi.org/10.1155/2020/7870534.

Abstract:
Classical decision trees such as C4.5 and CART partition the feature space using axis-parallel splits. Oblique decision trees use the oblique splits based on linear combinations of features to potentially simplify the boundary structure. Although oblique decision trees have higher generalization accuracy, most oblique split methods are not directly conducive to the categorical data and are computationally expensive. In this paper, we propose a multiway splits decision tree (MSDT) algorithm, which adopts feature weighting and clustering. This method can combine multiple numerical features, multiple categorical features, or multiple mixed features. Experimental results show that MSDT has excellent performance for multiple types of data.
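The distinction between axis-parallel and oblique splits drawn in this abstract can be illustrated with a small sketch. It shows only the two split tests on numeric features; it is not the MSDT algorithm, and the function names are hypothetical.

```python
def axis_parallel_split(x, feature_index, threshold):
    """Axis-parallel test: compare a single feature with a threshold."""
    return x[feature_index] <= threshold


def oblique_split(x, weights, threshold):
    """Oblique test: compare a linear combination of features with a threshold."""
    return sum(w * xi for w, xi in zip(weights, x)) <= threshold


points = [(0.2, 0.3), (0.9, 0.9), (0.1, 0.8), (0.7, 0.2)]

# A diagonal boundary x0 + x1 <= 1 needs one oblique split ...
oblique_sides = [oblique_split(p, (1.0, 1.0), 1.0) for p in points]

# ... while an axis-parallel split is the special case of a one-hot weight vector.
axis_sides = [axis_parallel_split(p, 0, 0.5) for p in points]
one_hot_sides = [oblique_split(p, (1.0, 0.0), 0.5) for p in points]
print(oblique_sides, axis_sides, one_hot_sides)
```

A linear combination is not defined for categorical values without some encoding, which is one way to read the abstract's remark that most oblique split methods do not apply directly to categorical data.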
23

Choi, Wonei, Hanlim Lee, and Jeonghyeon Park. "A First Approach to Aerosol Classification Using Space-Borne Measurement Data: Machine Learning-Based Algorithm and Evaluation." Remote Sensing 13, no. 4 (February 8, 2021): 609. http://dx.doi.org/10.3390/rs13040609.

Abstract:
A new method was developed for classifying aerosol types involving a machine-learning approach to the use of satellite data. An Aerosol Robotic NETwork (AERONET)-based aerosol-type dataset was used as a target variable in a random forest (RF) model. The contributions of satellite input variables to the RF-based model were quantified to determine an optimal set of input variables. The new method, based on inputs of satellite variables, allows the classification of seven aerosol types: pure dust, dust-dominant mixed, pollution-dominant mixed aerosols, and pollution aerosols (strongly, moderately, weakly, and non-absorbing). The performance of the model was statistically evaluated using AERONET data excluded from the model training dataset. Model accuracy for classifying the seven aerosol types was 59%, improving to 72% for four types (pure dust, dust-dominant mixed, strongly absorbing, and non-absorbing). The performance of the model was evaluated against an earlier aerosol classification method based on the wavelength dependence of single-scattering albedo (SSA) and fine-mode-fraction values from AERONET. Typical wavelength dependences of SSA for individual aerosol types are consistent with those obtained for aerosol types by the new method. This study demonstrates that an RF-based model is capable of satellite aerosol classification with sensitivity to the contribution of non-spherical particles.
24

Bhat, Chandra R. "A new generalized heterogeneous data model (GHDM) to jointly model mixed types of dependent variables." Transportation Research Part B: Methodological 79 (September 2015): 50–77. http://dx.doi.org/10.1016/j.trb.2015.05.017.

25

Choi, Y., Y. S. Ghim, and B. N. Holben. "Identification of column-integrated dominant aerosols using the archive of AERONET data set." Atmospheric Chemistry and Physics Discussions 13, no. 10 (October 15, 2013): 26627–56. http://dx.doi.org/10.5194/acpd-13-26627-2013.

Abstract:
Dominant aerosols were distinguished from level 2 inversion products for the Anmyon Aerosol Robotic Network (AERONET) site between 1999 and 2007. Secondary inorganic ions, black carbon (BC) and organic carbon (OC) were separated from fine mode aerosols, and mineral dust (MD), MD mixed with carbon, mixed coarse particles were separated from coarse mode aerosols. Four parameters (aerosol optical depth, single scattering albedo, absorption Angstrom exponent, and fine mode fraction) were used for this classification. Monthly variation of the occurrence rate of each aerosol type reveals that MD and MD mixed with carbon are frequent in spring. Although the fraction among dominant aerosols and occurrence rates of BC and OC tend to be high in cold season for heating, their contributions are variable but consistent due to various combustion sources. Secondary inorganic ions are most prevalent from June to August; the effective radius of these fine mode aerosols increases with water vapor content because of hygroscopic growth. To evaluate the validity of aerosol types identified, dominant aerosols at worldwide AERONET sites (Beijing, Mexico City, Goddard Space Flight Center, Mongu, Alta Floresta, Cape Verde), which have distinct source characteristics, were classified into the same aerosol types. The occurrence rate and fraction of the aerosol types at the selected sites confirm that the classification in this study is reasonable. However, mean optical properties of the aerosol types are generally influenced by the aerosol types with large fractions. The present work shows that the identification of dominant aerosols is effective even at a single site, provided that the archive of the data set is properly available.
26

Trouvé, Raphael, Ruizhu Jiang, Melissa Fedrigo, Matt D. White, Sabine Kasel, Patrick J. Baker, and Craig R. Nitschke. "Combining Environmental, Multispectral, and LiDAR Data Improves Forest Type Classification: A Case Study on Mapping Cool Temperate Rainforests and Mixed Forests." Remote Sensing 15, no. 1 (December 22, 2022): 60. http://dx.doi.org/10.3390/rs15010060.

Abstract:
Predictive vegetation mapping is an essential tool for managing and conserving high conservation-value forests. Cool temperate rainforests (Rainforest) and cool temperate mixed forests (Mixed Forest, i.e., rainforest spp. overtopped by large remnant Eucalyptus trees) are threatened forest types in the Central Highlands of Victoria. Logging of these forest types is prohibited; however, the surrounding native Eucalyptus forests can be logged in some areas of the landscape. This requires accurate mapping and delineation of these vegetation types. In this study, we combine niche modelling, multispectral imagery, and LiDAR data to improve predictive vegetation mapping of these two threatened ecosystems in southeast Australia. We used a dataset of 1586 plots partitioned into four distinct forest types that occur in close proximity in the Central Highlands: Eucalyptus, Tree fern, Mixed Forest, and Rainforest. We calibrated our model on a training dataset and validated it on a spatially distinct testing dataset. To avoid overfitting, we used Bayesian regularized multinomial regression to relate predictors to our four forest types. We found that multispectral predictors were able to distinguish Rainforest from Eucalyptus forests due to differences in their spectral signatures. LiDAR-derived predictors were effective at discriminating Mixed Forest from Rainforest based on forest structure, particularly LiDAR predictors based on existing domain knowledge of the system. For example, the best predictor of Mixed Forest was the presence of Rainforest-type understorey overtopped by large Eucalyptus crowns, which is effectively aligned with the regulatory definition of Mixed Forest. Environmental predictors improved model performance marginally, but helped discriminate riparian forests from Rainforest. However, the best model for classifying forest types was the model that included all three classes of predictors (i.e., spectral, structural, and environmental). 
Using multiple data sources with differing strengths improved classification accuracy and successfully predicted the identity of 88% of the plots. Our study demonstrated that multi-source methods are important for capturing different properties of the data that discriminate ecosystems. In addition, the multi-source approach facilitated adding custom metrics based on domain knowledge which in turn improved the mapping of high conservation-value forest.
27

Cortés, Antoni, Marta Cascante, María Luz Cárdenas, and Athel Cornish-Bowden. "Relationships between inhibition constants, inhibitor concentrations for 50% inhibition and types of inhibition: new ways of analysing data." Biochemical Journal 357, no. 1 (June 25, 2001): 263–68. http://dx.doi.org/10.1042/bj3570263.

Abstract:
The concentration of an inhibitor that decreases the rate of an enzyme-catalysed reaction by 50%, symbolized i0.5, is often used in pharmacological studies to characterize inhibitors. It can be estimated from the common inhibition plots used in biochemistry by means of the fact that the extrapolated inhibitor concentration at which the rate becomes infinite is equal to −i0.5. This method is, in principle, more accurate than comparing the rates at various different inhibitor concentrations, and inferring the value of i0.5 by interpolation. Its reciprocal, 1/i0.5, is linearly dependent on v0/V, the uninhibited rate divided by the limiting rate, and the extrapolated value of v0/V at which 1/i0.5 is zero allows the type of inhibition to be characterized: this value is 1 if the inhibition is strictly competitive; greater than 1 if the inhibition is mixed with a predominantly competitive component; infinite (i.e. 1/i0.5 does not vary with v0/V) if the inhibition is pure non-competitive (i.e. mixed with competitive and uncompetitive components equal); negative if the inhibition is mixed with a predominantly uncompetitive component; and zero if it is strictly uncompetitive. The type of analysis proposed has been tested experimentally by examining inhibition of lactate dehydrogenase by oxalate (an uncompetitive inhibitor with respect to pyruvate) and oxamate (a competitive inhibitor with respect to pyruvate), and of cytosolic malate dehydrogenase by hydroxymalonate (a mixed inhibitor with respect to oxaloacetate). In all cases there is excellent agreement between theory and experiment.
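The linear dependence of 1/i0.5 on v0/V, and the five cases for its zero crossing, can be checked with a short sketch. It assumes the standard textbook rate law for linear mixed inhibition, v = V·a / (Km·(1 + i/K_ic) + a·(1 + i/K_iu)), which the abstract does not state; the symbols K_ic and K_iu (competitive and uncompetitive inhibition constants) and the function names are likewise assumptions introduced here.

```python
import math


def reciprocal_i50(v0_over_V, K_ic, K_iu):
    """1/i0.5 as a function of v0/V under the assumed mixed-inhibition rate law.

    Setting v = v0/2 in the rate law gives
    1/i0.5 = (1 - v0/V)/K_ic + (v0/V)/K_iu, a straight line in v0/V.
    """
    return (1.0 - v0_over_V) / K_ic + v0_over_V / K_iu


def zero_crossing(K_ic, K_iu):
    """Extrapolated v0/V at which 1/i0.5 is zero.

    Returns None when K_ic == K_iu (pure non-competitive: the line is
    flat, so the crossing is at infinity).
    """
    if K_ic == K_iu:
        return None
    return 1.0 / (1.0 - K_ic / K_iu)


cases = {
    "competitive": zero_crossing(2.0, math.inf),              # exactly 1
    "mixed, mostly competitive": zero_crossing(1.0, 4.0),     # greater than 1
    "pure non-competitive": zero_crossing(2.0, 2.0),          # None (infinite)
    "mixed, mostly uncompetitive": zero_crossing(4.0, 1.0),   # negative
    "uncompetitive": zero_crossing(math.inf, 3.0),            # zero
}
for name, value in cases.items():
    print(name, value)
```

Under these assumptions the zero crossing equals K_iu / (K_iu - K_ic), which reproduces each of the cases listed in the abstract.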
28

Dawadi, Saraswati, Sagun Shrestha, and Ram A. Giri. "Mixed-Methods Research: A Discussion on its Types, Challenges, and Criticisms." Journal of Practical Studies in Education 2, no. 2 (February 24, 2021): 25–36. http://dx.doi.org/10.46809/jpse.v2i2.20.

Abstract:
The article positions mixed-method research (MMR) as a principled complementary research method to the traditional quantitative and qualitative research approaches. By situating MMR in an analysis of some of the common research paradigms, the article presents it as a natural choice in order to complement and cater to the increasingly complex needs of contemporary researchers. It proffers MMR as a flexible and adaptive conceptual framework for designing and conducting mixed methods research in a simplified manner. By explaining fundamental principles and major theoretical tenets of a mixed-methods approach, which involves both quantitative and qualitative data collection in response to research questions, it elucidates several benefits of adopting MMR since it integrates post-positivism as well as interpretivism frameworks. There is abundant literature around this research design aiming to provide researchers an understanding of the approach. Yet there is limited literature that provides illustrative guidance to research novices in comprehending mixed methods, understanding reasons for choosing it, and selecting an appropriate mixed methods design. Based on an analysis of some notable works in the field, this article provides an overview of mixed methods designs, discusses its main types, and explains challenges one can potentially encounter when using them, with a view to assisting early career researchers in particular and other researchers in general.
29

Sharma, Dr Lok Raj, Sandesh Bidari, Dinesh Bidari, Sushil Neupane, and Rambabu Sapkota. "Exploring the Mixed Methods Research Design: Types, Purposes, Strengths, Challenges, and Criticisms." Global Academic Journal of Linguistics and Literature 5, no. 1 (January 20, 2023): 3–12. http://dx.doi.org/10.36348/gajll.2023.v05i01.002.

Abstract:
A mixed methods research design, which is a complex approach, combines both quantitative and qualitative data in a single study or succession of studies. This design can be particularly functional for exploring complex research questions that cannot be fully answered by using a single research design. Moreover, a mixed methods design is necessary to examine the relationships between different variables because examining the relationships between diverse variables is not viable just through a single research design. This design is required to complement and cater to the increasingly multifarious requirements of contemporary researchers. This article, which explores and discusses types, purposes, strengths, challenges and criticisms of the mixed methods research design as its objectives, stems from an analysis of some notable works in the field. It is grounded on the secondary qualitative data accumulated in the forms of words from journal articles and books related to the research designs. It assists the novices in the field of research in particular and other researchers in general by providing them with an overview of mixed methods design along with its types, such as convergent parallel, explanatory sequential, exploratory sequential, embedded, transformative and multiphase designs.
30

Będowska-Sójka, Barbara, and Agata Kliber. "Do mixed-data sampling models help forecast liquidity and volatility?" Przegląd Statystyczny 69, no. 2 (October 31, 2022): 1–19. http://dx.doi.org/10.5604/01.3001.0016.0363.

Abstract:
This paper aims to contribute to the existing studies on the Granger-causal relationship between volatility and liquidity in the stock market. We examine whether liquidity improves volatility forecasts and whether volatility allows the improvement of liquidity forecasts. The forecasts based on the mixed-data sampling models, MIDAS, are compared to those obtained from models based on daily data. Our results show that volatility and liquidity forecasts from MIDAS models outperform naive forecasts. On the other hand, the application of mixed-data sampling models does not significantly improve the performance of the forecasts of either liquidity or volatility based on a univariate autoregressive model or a vector autoregressive one. We found that in terms of the forecasting ability, the VAR models and the AR models seem to perform equally well, as the differences in forecasting errors generated by these two types of models are not statistically significant.
APA, Harvard, Vancouver, ISO, and other styles
31

Pelt, Daniël, Kees Batenburg, and James Sethian. "Improving Tomographic Reconstruction from Limited Data Using Mixed-Scale Dense Convolutional Neural Networks." Journal of Imaging 4, no. 11 (October 30, 2018): 128. http://dx.doi.org/10.3390/jimaging4110128.

Abstract:
In many applications of tomography, the acquired data are limited in one or more ways due to unavoidable experimental constraints. In such cases, popular direct reconstruction algorithms tend to produce inaccurate images, and more accurate iterative algorithms often have prohibitively high computational costs. Using machine learning to improve the image quality of direct algorithms is a recently proposed alternative, for which promising results have been shown. However, previous attempts have focused on using encoder–decoder networks, which have several disadvantages when applied to large tomographic images, preventing wide application in practice. Here, we propose the use of the Mixed-Scale Dense convolutional neural network architecture, which was specifically designed to avoid these disadvantages, to improve tomographic reconstruction from limited data. Results are shown for various types of data limitations and object types, for both simulated data and large-scale real-world experimental data. The results are compared with popular tomographic reconstruction algorithms and machine learning algorithms, showing that Mixed-Scale Dense networks are able to significantly improve reconstruction quality even with severely limited data, and produce more accurate results than existing algorithms.
32

Oksenoyd, E. E., V. A. Volkov, E. V. Oleynik, and G. P. Myasnikova. "KEROGEN TYPES OF BAZHENOV FORMATION BASED ON PYROLYSIS DATA AND THEIR COMPARISON WITH OIL PARAMETERS." Oil and Gas Studies, no. 5 (November 1, 2017): 34–43. http://dx.doi.org/10.31660/0445-0108-2017-5-34-43.

Abstract:
Based on pyrolytic data (3 995 samples from 208 wells), organic matter types of the Bazhenov Formation are identified in the central part of the Western Siberian basin. Zones of kerogen types I, II, III and mixed I-II and II-III are mapped. The content of sulfur, paraffins, resins and asphaltenes, as well as the viscosity, density, temperature and gas content of oils from Upper Jurassic and Lower Cretaceous sediments (3 806 oil pools), are mapped. Oil gradations are identified and distributed. An alternative model of zones of kerogen types II and IIS is presented. The established distributions of organic matter types can be used in basin modeling and in the assessment of oil-and-gas bearing prospects.
33

Geserbaatar, N. E., E. Nasanbat, and O. Lkhamjav. "THE IMPACT OF FOREST FIRE ON FOREST COVER TYPES IN MONGOLIA." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B3-2020 (August 21, 2020): 693–98. http://dx.doi.org/10.5194/isprs-archives-xliii-b3-2020-693-2020.

Abstract:
The objective of this study was to assess the impact of forest fire on forest cover types. Using remote sensing imagery, the study identified non-forest and forest areas, with seven forest classes: cedar, pine, larch, birch, birch-pine mixed, birch-larch mixed and cedar-larch mixed. Several classification approaches have been applied to Landsat imagery. Moreover, recent developments in segmentation and object-oriented techniques offer a suitable means of classifying satellite data. In the object-oriented classification approach, images are clustered into homogeneous areas as forest types using suitable parameters at some level. The accuracy analysis showed good overall accuracy (86.33 percent in 2000 and 93.75 percent in 2011) in identifying forest cover and type. Furthermore, these results suggest that Landsat TM and ETM+ data can reliably detect forest type based on segmentation and object-oriented techniques. In general, the study area is a region at high risk of forest fires, which strongly influence forest cover, tree species and other ecosystems. Overall, the wildfire impact results showed that 25239 ha of forest were changed to burnt area and 52603 ha of forest were changed to grassland.
34

Stolz, Jörg, and Anaïd Lindemann. "The Titanic Game: Introducing Game Heuristics to Mixed Methods Theorizing and Data Analysis." Journal of Mixed Methods Research 14, no. 4 (November 12, 2019): 522–44. http://dx.doi.org/10.1177/1558689819885723.

Abstract:
Despite tremendous interest in social games and game studies, the potential of game heuristics for the field of mixed methods remains unknown. This article introduces game heuristics to mixed methods research, showing how it was used in a specific study on the survival probabilities on the Titanic. Specifically, we describe how game heuristics was used to create the explanandum, code and interpret the qualitative material, and set up and interpret the quantitative model. Furthermore, we show and explicate how game heuristics was used to construct seven types of meta-inferences. The Titanic data set is especially interesting, since it is routinely used for statistical mono-method teaching; however, it can be shown that a mixed methods approach leads to a better explanation.
35

Ramdhan, Muhammad, Yulius Yulius, and Nindya Kania Oktaviana. "Distribution of Tide Type in Indonesian Waters Based on 7 Days Data Measurement of Ipasoet-BIG Station." Jurnal Segara 17, no. 2 (November 25, 2021): 117. http://dx.doi.org/10.15578/segara.v17i2.9342.

Abstract:
Tidal data are needed in the fields of energy, marine navigation, coastal construction and other activities related to the oceans. Tidal phenomena occur due to the interaction of the earth with space objects. The sea level rise in coastal waters can be modeled by a harmonic function containing tidal constants. From these constants, a Formzahl number can be calculated that indicates the type of tide occurring at the observation station. This paper describes the distribution pattern of tidal types in Indonesian waters based on observation data collected at stations belonging to the Geospatial Information Agency. The result is that there are 4 types of tides in Indonesian waters, the most dominant being the mixed tide, prevailing semidiurnal type.
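As a rough illustration of the Formzahl calculation mentioned in this abstract, the sketch below uses the widely cited definition F = (K1 + O1) / (M2 + S2), where the symbols are the amplitudes of the main diurnal and semidiurnal constituents, together with the commonly quoted class boundaries of 0.25, 1.5 and 3.0. The abstract itself gives neither the formula nor the thresholds, so both are assumptions here, and the amplitudes in the example are hypothetical values, not data from the paper.

```python
def formzahl(k1, o1, m2, s2):
    """Formzahl number F = (K1 + O1) / (M2 + S2) from constituent amplitudes.

    All four amplitudes must use the same unit (for example metres).
    """
    return (k1 + o1) / (m2 + s2)


def tide_type(f):
    """Map a Formzahl number to one of four tide types.

    The boundaries 0.25, 1.5 and 3.0 are the commonly quoted ones and are
    an assumption here; the abstract does not state them.
    """
    if f <= 0.25:
        return "semidiurnal"
    if f <= 1.5:
        return "mixed, prevailing semidiurnal"
    if f <= 3.0:
        return "mixed, prevailing diurnal"
    return "diurnal"


# Hypothetical amplitudes in metres, for illustration only.
f = formzahl(k1=0.20, o1=0.12, m2=0.45, s2=0.20)
print(round(f, 3), tide_type(f))
```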
36

Baizid, A. R., and M. S. Alam. "Wigner Rotations of Different Types of Lorentz Transformations." Journal of Scientific Research 8, no. 3 (September 1, 2016): 249–58. http://dx.doi.org/10.3329/jsr.v8i3.27033.

Abstract:
We have studied Wigner rotations of different types of Lorentz transformations according to the nature of the movement of one inertial frame relative to another. When the motion is along an arbitrary direction, we can find the formulae for Wigner rotations using the velocity addition formulae for the most general, mixed number, quaternion and geometric product Lorentz transformations. Finally, we have used simulated data to apply the Wigner rotation formula to the pion decay chain and presented the result.
37

Bhat, Chandra R., Abdul R. Pinjari, Subodh K. Dubey, and Amin S. Hamdi. "On accommodating spatial interactions in a Generalized Heterogeneous Data Model (GHDM) of mixed types of dependent variables." Transportation Research Part B: Methodological 94 (December 2016): 240–63. http://dx.doi.org/10.1016/j.trb.2016.09.002.

38

RANI and R. B. MISRA. "ML ESTIMATES FOR CROW/AMSAA RELIABILITY GROWTH MODEL FOR GROUPED AND MIXED TYPES OF SOFTWARE FAILURE DATA." International Journal of Reliability, Quality and Safety Engineering 11, no. 04 (December 2004): 329–37. http://dx.doi.org/10.1142/s0218539304001555.

Abstract:
A number of software reliability growth models have been proposed in the literature for estimating reliability during software testing. Duane's model, originally proposed for hardware reliability, is also used in estimating the reliability of software during development testing. The graphical interpretation of Duane's postulate was subsequently given a concrete stochastic basis by Crow, who provided a comprehensive treatment of this model in the context of reliability growth and demonstrated its elegant inferential aspects. The parameters of the Crow model have a physical interpretation and can yield a quantitative measure for reliability growth assessment. This paper proposes a simple and efficient procedure to determine the parameters of the Crow/AMSAA model using a one-dimensional bisection method for grouped/interval data, where failures are recorded at various time points. In addition, this paper proposes a method to estimate the parameters when there exists a mixture of grouped and individual (mixed or hybrid) data types. The application of the proposed method is illustrated with numerical examples using both simulated and real software failure data.
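The grouped-data estimation described in this abstract can be sketched as follows. This is not the authors' procedure: it is a minimal illustration that assumes the standard Crow/AMSAA power-law model, with expected cumulative failures λt^β, and solves the usual grouped-data likelihood equation for β by bisection. The function names, the bracketing interval and the example counts are all hypothetical, and the mixed grouped/individual case is not covered.

```python
import math


def _score(beta, times, counts):
    """Grouped-data likelihood equation for beta (zero at the ML estimate).

    times  : increasing interval end points t_1 < ... < t_k (t_0 = 0 implied)
    counts : number of failures observed in each interval (t_{i-1}, t_i]
    """
    log_tk = math.log(times[-1])
    total, prev = 0.0, 0.0
    for t, n in zip(times, counts):
        upper = t ** beta * math.log(t)
        # Convention: t_0**beta * ln(t_0) is taken as 0 when t_0 = 0.
        lower = prev ** beta * math.log(prev) if prev > 0 else 0.0
        total += n * ((upper - lower) / (t ** beta - prev ** beta) - log_tk)
        prev = t
    return total


def fit_crow_amsaa_grouped(times, counts, lo=1e-3, hi=10.0, tol=1e-10):
    """Return (lam, beta) for grouped data, using bisection on beta.

    The bracket [lo, hi] is an assumed search range, not taken from the paper.
    """
    f_lo, f_hi = _score(lo, times, counts), _score(hi, times, counts)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on [lo, hi]; widen the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = _score(mid, times, counts)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    beta = 0.5 * (lo + hi)
    lam = sum(counts) / times[-1] ** beta  # closed form once beta is known
    return lam, beta


# Hypothetical failure counts per 100-hour test interval, for illustration only.
lam_hat, beta_hat = fit_crow_amsaa_grouped([100, 200, 300, 400], [20, 9, 6, 5])
print(lam_hat, beta_hat)
```

An estimate of β below 1 would indicate a decreasing failure intensity, that is, reliability growth.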
39

Munn, Ian A., Yushun Zhai, and David L. Evans. "Modeling Forest Fire Probabilities in the South Central United States Using FIA Data." Southern Journal of Applied Forestry 27, no. 1 (February 1, 2003): 11–17. http://dx.doi.org/10.1093/sjaf/27.1.11.

Abstract:
Factors influencing the probability of fire occurrence in the south central United States were investigated using a geographic information system (GIS) and a multinomial logit model. Forest Inventory and Analysis (FIA) data at the plot level were merged with census data at the census-tract level to create a data set containing demographic, geographic, and timber-related characteristics. A multinomial logit model was employed to estimate the relationships between plot characteristics and the probability of wildfires, prescribed fires, and fires of unknown origins. Wildfires occurred more frequently on public forests than industrial and nonindustrial private forests (NIPFs). The probability of wildfire increased with proximity to urban areas and “built-up” areas of 4 ha or more in size. Wildfires occurred more frequently in younger stands and in pine and mixed pine-hardwood types than in hardwood types. Prescribed fires occurred more frequently on public and industrial forests than on NIPFs. The probability of prescribed fires increased with proximity to roads, urban areas, built-up areas of 4 ha or more, and on flatter terrain, but was inversely related to population density. Fire was prescribed less frequently for pole-sized stands than sawtimber size stands and more frequently for pine and mixed pine-hardwood types than for hardwood types. Education levels and median household incomes of the surrounding census tract had no significant effects on the probability of any type of fire. South. J. Appl. For. 27(1):11–17.
40

Zhang, Xinyan, Boyi Guo, and Nengjun Yi. "Zero-Inflated gaussian mixed models for analyzing longitudinal microbiome data." PLOS ONE 15, no. 11 (November 9, 2020): e0242073. http://dx.doi.org/10.1371/journal.pone.0242073.

Abstract:
Motivation: The human microbiome is variable and dynamic in nature. Longitudinal studies could explain the mechanisms in maintaining the microbiome in health or causing dysbiosis in disease. However, it remains challenging to properly analyze the longitudinal microbiome data from either 16S rRNA or metagenome shotgun sequencing studies, output as proportions or counts. Most microbiome data are sparse, requiring statistical models to handle zero-inflation. Moreover, longitudinal design induces correlation among the samples and thus further complicates the analysis and interpretation of the microbiome data. Results: In this article, we propose zero-inflated Gaussian mixed models (ZIGMMs) to analyze longitudinal microbiome data. ZIGMMs are a robust and flexible method applicable to longitudinal microbiome proportion data or count data generated with either 16S rRNA or shotgun sequencing technologies. They can include various types of fixed effects and random effects, account for various within-subject correlation structures, and effectively handle zero-inflation. We developed an efficient Expectation-Maximization (EM) algorithm to fit the ZIGMMs by taking advantage of the standard procedure for fitting linear mixed models. We demonstrate the computational efficiency of our EM algorithm by comparing it with two other zero-inflated methods. We show that ZIGMMs outperform the previously used linear mixed models (LMMs), negative binomial mixed models (NBMMs) and zero-inflated Beta regression mixed model (ZIBR) in detecting associated effects in longitudinal microbiome data through extensive simulations. We also apply our method to two public longitudinal microbiome datasets and compare it with LMMs and NBMMs in detecting dynamic effects of associated taxa.
41

Cowpertwait, P. S. P. "Mixed rectangular pulses models of rainfall." Hydrology and Earth System Sciences 8, no. 5 (October 31, 2004): 993–1000. http://dx.doi.org/10.5194/hess-8-993-2004.

Abstract:
A stochastic rainfall model, obtained as the superposition of independent Neyman-Scott Rectangular Pulses (NSRP), is proposed to provide a flexible parameterisation and general procedure for modelling rainfall. The methodology is illustrated using hourly data from Auckland, New Zealand, where the model is fitted to data collected for each calendar month over the period 1966–1998. For data taken over the months April to August, two independent superposed NSRP processes are fitted, which may correspond to the existence of mixtures of convective and stratiform storm types for these months. The special case of the superposition of an independent NSRP process and a Poisson rectangular pulses process fits the data for January to March, whilst the original NSRP model (i.e. without superposition) fits the data for September to November. A simulation study verifies that the model performs well with respect to the distribution of annual totals, the proportion of dry periods, and extreme values. Keywords: stochastic processes; point processes; rainfall time series; Poisson cluster models
42

Leech, Nancy L., Kathleen M. T. Collins, Qun G. Jiao, and Anthony J. Onwuegbuzie. "Mixed Research in Gifted Education." Journal for the Education of the Gifted 34, no. 6 (November 4, 2011): 860–75. http://dx.doi.org/10.1177/0162353211425095.

Abstract:
The purpose of this study was to identify the prevalence of mixed research techniques in empirical studies published in gifted education journals. During Phase 1, empirical full-text databases and relevant electronic bibliographic databases related to gifted education were searched during a time span of 10 to 18 years, resulting in the identification of 32 mixed research studies. During Phase 2, frequency data were compiled detailing the types of methods (quantitative, qualitative, mixed) implemented in empirical studies published in three leading gifted education research journals covering the time span of 5 years. A sequential mixed analysis was conducted on Phase 2 data, and results indicated that authors of empirical research articles utilized primarily quantitative methods. Among the 19 studies identified as mixed research, 5 utilized a mixed design that was categorized as a partially mixed, concurrent, dominant status design.
43

Zhang, Xinyan, and Nengjun Yi. "Fast zero-inflated negative binomial mixed modeling approach for analyzing longitudinal metagenomics data." Bioinformatics 36, no. 8 (January 6, 2020): 2345–51. http://dx.doi.org/10.1093/bioinformatics/btz973.

Abstract:
Motivation: Longitudinal metagenomics data, including both 16S rRNA and whole-metagenome shotgun sequencing data, enhanced our abilities to understand the dynamic associations between the human microbiome and various diseases. However, analytic tools have not been fully developed to simultaneously address the main challenges of longitudinal metagenomics data, i.e. high-dimensionality, dependence among samples and zero-inflation of observed counts. Results: We propose a fast zero-inflated negative binomial mixed modeling (FZINBMM) approach to analyze high-dimensional longitudinal metagenomic count data. The FZINBMM approach is based on zero-inflated negative binomial mixed models (ZINBMMs) for modeling longitudinal metagenomic count data and a fast EM-IWLS algorithm for fitting ZINBMMs. FZINBMM takes advantage of a commonly used procedure for fitting linear mixed models, which allows us to include various types of fixed and random effects and within-subject correlation structures and quickly analyze many taxa. We found that FZINBMM remarkably outperformed in computational efficiency and was statistically comparable with two R packages, GLMMadaptive and glmmTMB, that use numerical integration to fit ZINBMMs. Extensive simulations and real data applications showed that FZINBMM outperformed other previous methods, including linear mixed models, negative binomial mixed models and zero-inflated Gaussian mixed models. Availability and implementation: FZINBMM has been implemented in the R package NBZIMM, available in the public GitHub repository http://github.com//nyiuab//NBZIMM. Supplementary information: Supplementary data are available at Bioinformatics online.
44

Li, Ziyi, Zhijin Wu, Peng Jin, and Hao Wu. "Dissecting differential signals in high-throughput data from complex tissues." Bioinformatics 35, no. 20 (March 23, 2019): 3898–905. http://dx.doi.org/10.1093/bioinformatics/btz196.

Abstract:
Motivation: Samples from clinical practices are often mixtures of different cell types. The high-throughput data obtained from these samples are thus mixed signals. The cell mixture brings complications to data analysis, and will lead to biased results if not properly accounted for. Results: We develop a method to model the high-throughput data from mixed, heterogeneous samples, and to detect differential signals. Our method allows flexible statistical inference for detecting a variety of cell-type specific changes. Extensive simulation studies and analyses of two real datasets demonstrate the favorable performance of our proposed method compared with existing ones serving a similar purpose. Availability and implementation: The proposed method is implemented as an R package and is freely available on GitHub (https://github.com/ziyili20/TOAST). Supplementary information: Supplementary data are available at Bioinformatics online.
45

Zheng, Y. C., L. L. Li, and Y. P. Wang. "AN AEROSOL TYPE CLASSIFICATION METHOD BASED ON REMOTE SENSING DATA IN GUANGDONG, CHINA." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-3/W9 (October 25, 2019): 239–43. http://dx.doi.org/10.5194/isprs-archives-xlii-3-w9-239-2019.

Abstract:
This paper provides an aerosol classification method based on remote sensing data for Guangdong, China, in 2010 and 2011. Aerosol Optical Depth, Angstrom Exponent and Ultraviolet Aerosol Index, as important properties of aerosols, are introduced into the classification. Data on these three aerosol properties are integrated to establish a three-dimensional dataset, and a k-means clustering algorithm with Mahalanobis distance is used to find four clusters in the dataset, which represent four aerosol types: urban-industrial, dust, biomass burning and mixed type. Prior knowledge about each aerosol type is used to associate each cluster with an aerosol type. The temporal variation of the aerosol properties shows similarities between the two years. The proportion of aerosol types in different cities of Guangdong Province is also calculated, and the result shows that in most cities urban-industrial aerosols take the largest proportion while mixed type aerosols take second place. The classification results show that the k-means clustering algorithm with Mahalanobis distance is a simple and efficient method for aerosol classification.
46

Yang, Hanbing, Meichen Fu, Li Wang, and Feng Tang. "Mixed Land Use Evaluation and Its Impact on Housing Prices in Beijing Based on Multi-Source Big Data." Land 10, no. 10 (October 18, 2021): 1103. http://dx.doi.org/10.3390/land10101103.

Abstract:
The tense relationship between the supply and demand of land resources and the past spatial expansion of urban development in Beijing have brought many urban problems. Mixed land use is considered able to solve these urban problems and to promote sustainable urban development. In this context, this study uses multi-source big data such as POI, OpenStreetMap and web crawler data to construct current land-use data for the area within the sixth ring road of Beijing, and then uses the entropy index and type number index to analyze the spatial distribution and aggregation characteristics of the mixed land-use level. Finally, a multi-scale geographically weighted regression is applied to explore the impact of block-scale and life-circle-scale mixed land use on housing prices. The results show that: (1) the accuracy of land-use data obtained from multi-source big data is high, with consistency with the real land-use situation reaching 82.67%; (2) the mixed land-use level in the study area is higher in the urban center and lower on the periphery of the city; however, it does not decrease gradually with increasing distance from the urban center but is highest in the area from the third to the fifth ring road; (3) the impacts of block-scale and life-circle-scale mixed land use on housing prices differ. The type number index has a negative effect on housing prices for block-scale mixed land use, while the entropy index has a positive effect on housing prices for life-circle-scale mixed land use. Based on the existing “bottom-up” individual-dominant development mode, the government of Beijing should issue relevant policies and documents to give “top-down” control and guidance in the future, so as to maximize the benefits of mixed land use. Furthermore, in the practice of mixed land use in Beijing, land use types should be reduced at the block scale and the area of different land use types should be balanced at the life circle scale.
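The two indices named in this abstract can be illustrated with a short sketch. The abstract does not give their formulas, so the definitions below are assumptions: the entropy index is taken to be the Shannon entropy of land-use area shares normalised by the logarithm of the number of categories, and the type number index is taken to be the count of land-use types present in a unit. The area values are hypothetical.

```python
import math


def entropy_index(areas):
    """Normalised Shannon entropy of land-use area shares (assumed definition).

    areas : area of each land-use category within one spatial unit.
    Returns 0 for a single-use unit and 1 when all categories are equal.
    """
    total = sum(areas)
    n = len(areas)
    if total <= 0 or n < 2:
        return 0.0
    shares = [a / total for a in areas if a > 0]
    h = -sum(p * math.log(p) for p in shares)
    return h / math.log(n)


def type_number_index(areas):
    """Number of land-use types present in the unit (assumed definition)."""
    return sum(1 for a in areas if a > 0)


# Hypothetical areas (ha) of residential, commercial, office and green land.
block = [6.0, 2.0, 0.0, 2.0]
print(entropy_index(block), type_number_index(block))
```

Under these assumed definitions, a block can contain many land-use types yet still score low on the entropy index when one type dominates the area, which is one reason the two indices can relate differently to housing prices.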
47

Dharmadhikari, Susheel, Riddhiman Raut, Asok Ray, and Amrita Basak. "A Unified Mixed Deep Neural Network for Fatigue Damage Detection in Components with Different Stress Concentrations." Applied Sciences 13, no. 3 (January 25, 2023): 1542. http://dx.doi.org/10.3390/app13031542.

Abstract:
The article presents a mixed deep neural network (DNN) approach for detecting micron-scale fatigue damage in high-strength polycrystalline aluminum alloys. Fatigue testing is conducted using a custom-designed apparatus integrated with a confocal microscope and a moving stage to accurately pinpoint the instance of micron-scale crack emergence. The specimens are monitored throughout the duration of the experiment using a pair of high-frequency ultrasonic transducers. The mixed DNN is trained with ultrasonic time-series data that are obtained from two sets of specimens categorized by different stress concentration factors. To understand the effects of mixing the data from both types of specimens, a parametric analysis is performed by varying the amount of training data from each specimen to develop a series of mixed DNNs. The mixed DNN, when tested on unseen data from both specimens, exhibits an accuracy of over 95%. This article, therefore, demonstrates a successful alternative to customized DNNs for new types, geometries, or stress concentration factors in the materials under consideration.
48

Sun, Meng, and Yi Lu. "A Generalized Linear Mixed Model for Data Breaches and Its Application in Cyber Insurance." Risks 10, no. 12 (November 23, 2022): 224. http://dx.doi.org/10.3390/risks10120224.

Abstract:
Data breach incidents result in severe financial loss and reputational damage, which raises the importance of using insurance to manage and mitigate cyber-related risks. We analyze the data breach chronology collected by Privacy Rights Clearinghouse (PRC) since 2001 and propose a Bayesian generalized linear mixed model for data breach incidents. Our model captures the dependency between frequency and severity of cyber losses and the behavior of cyber attacks on entities across time. Risk characteristics such as types of breach, types of organization, entity locations in chronology, as well as time trend effects are taken into consideration when investigating breach frequencies. Estimations of model parameters are presented under a Bayesian framework using a combination of the Gibbs sampler and the Metropolis–Hastings algorithm. Predictions and implications of the proposed model in enterprise risk management and cyber insurance rate filing are discussed and illustrated. We find that it is feasible and effective to use our proposed NB-GLMM for analyzing the number of data breach incidents with uniquely identified risk factors. Our results show that both geological location and business type play significant roles in measuring cyber risks. The outcomes of our predictive analytics can be utilized by insurers to price their cyber insurance products, and by corporate information technology (IT) and data security officers to develop risk mitigation strategies according to the company’s characteristics.
49

Rokhman, Nur. "A Survey on Mixed-Attribute Outlier Detection Methods." CommIT (Communication and Information Technology) Journal 13, no. 1 (May 31, 2019): 39. http://dx.doi.org/10.21512/commit.v13i1.5558.

Abstract:
In the data era, outlier detection methods play an important role. The existence of outliers can provide clues to the discovery of new things, irregularities in a system, or illegal intruders. Based on the data, outlier detection methods can be classified into numerical, categorical, or mixed-attribute data. However, the study of the outlier detection methods is generally conducted for numerical data. Meanwhile, many real-life facts are presented in mixed-attribute data. In this paper, the researcher presents a survey of outlier detection methods for mixed-attribute data. The methods are classified into four types, namely, categorized, enumerated, combined, and mixed outlier detection methods for mixed-attribute data. Through this classification, the methods can be easily analyzed and improved by applying appropriate functions.
50

Lazar, Alina, Ling Jin, C. Anna Spurlock, Kesheng Wu, Alex Sim, and Annika Todd. "Evaluating the Effects of Missing Values and Mixed Data Types on Social Sequence Clustering Using t-SNE Visualization." Journal of Data and Information Quality 11, no. 2 (May 9, 2019): 1–22. http://dx.doi.org/10.1145/3301294.
