Journal articles on the topic 'Probabilities – Data processing'


Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'Probabilities – Data processing.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles across a wide variety of disciplines and organise your bibliography correctly.

1

Vaidogas, Egidijus Rytas. "Bayesian Processing of Data on Bursts of Pressure Vessels." Information Technology and Control 50, no. 4 (December 16, 2021): 607–26. http://dx.doi.org/10.5755/j01.itc.50.4.29690.

Full text
Abstract:
Two alternative Bayesian approaches are proposed for the prediction of fragmentation of pressure vessels triggered by accidental explosions (bursts) of these containment structures. It is shown how to carry out this prediction with post-mortem data on fragment numbers counted after past explosion accidents. The results of the prediction are estimates of the probabilities of individual fragment numbers. These estimates are expressed by means of Bayesian prior or posterior distributions. It is demonstrated how to elicit the prior distributions from relatively scarce post-mortem data on vessel fragmentations. Specifically, it is suggested to develop priors with two Bayesian models known as the compound Poisson-gamma and multinomial-Dirichlet probability distributions. The available data are used to specify a non-informative prior for the Poisson parameter, which is subsequently transformed into priors of the individual fragment number probabilities. Alternatively, the data are applied to the specification of the Dirichlet concentration parameters. The latter priors directly express the epistemic uncertainty in the fragment number probabilities. Example calculations presented in the study demonstrate that the suggested non-informative prior distributions are responsive to updates with scarce data on vessel explosions. It is shown that priors specified with the Poisson-gamma and multinomial-Dirichlet models differ tangibly; however, this difference decreases with an increasing amount of new data. For the sake of brevity and concreteness, the study was limited to fire-induced vessel bursts known as boiling liquid expanding vapour explosions (BLEVEs).
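The multinomial-Dirichlet update described in this abstract is conjugate and can be sketched in a few lines. The fragment-number categories, prior concentrations, and accident counts below are invented placeholders, not the paper's data:

```python
import numpy as np

# Hypothetical fragment-number categories (e.g. 2, 3, 4, 5+ fragments) and
# non-informative Dirichlet concentration parameters (assumed values).
alpha_prior = np.array([1.0, 1.0, 1.0, 1.0])

# Assumed counts of past accidents producing each fragment number.
counts = np.array([5, 3, 1, 1])

# Conjugate multinomial-Dirichlet update: posterior concentrations are
# prior concentrations plus observed counts.
alpha_post = alpha_prior + counts

# Posterior mean estimates of the fragment-number probabilities and the
# epistemic uncertainty in each (posterior standard deviation).
a0 = alpha_post.sum()
p_mean = alpha_post / a0
p_std = np.sqrt(alpha_post * (a0 - alpha_post) / (a0**2 * (a0 + 1)))

for k, (m, s) in enumerate(zip(p_mean, p_std)):
    print(f"category {k}: P = {m:.3f} +/- {s:.3f}")
```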
APA, Harvard, Vancouver, ISO, and other styles
2

Ivanov, A. I., E. N. Kuprianov, and S. V. Tureev. "Neural network integration of classical statistical tests for processing small samples of biometrics data." Dependability 19, no. 2 (June 16, 2019): 22–27. http://dx.doi.org/10.21683/1729-2646-2019-19-2-22-27.

Full text
Abstract:
The aim of this paper is to increase the power of statistical tests through their joint application, thereby reducing the required size of the test sample. Methods. It is proposed to combine classical statistical tests, i.e. the chi-squared, Cramér-von Mises and Shapiro-Wilk tests, by means of equivalent artificial neurons. Each neuron compares the input statistic with a precomputed threshold and has two output states, which allows three bits of binary output code to be obtained from a network of three artificial neurons. Results. It is shown that each of these tests, on small samples of biometric data, produces high probabilities of errors of the first and second kind when testing the normality hypothesis. Neural network integration of the three tests under consideration enables a significant reduction of the probabilities of errors of the first and second kind. The paper sets forth the results of neural network integration of pairs, as well as triples, of the statistical tests under consideration. Conclusions. Expected probabilities of errors of the first and second kind are predicted for neural network integrations of 10 and 30 classical statistical tests for small samples of 21 observations. An important element of the prediction process is the symmetrization of the problem, whereby the probabilities of errors of the first and second kind are made identical and averaged out. The moduli of the pair correlation coefficients of the output states are likewise averaged by means of artificial neuron adders. Only in this case does the relationship between the number of integrated tests and the expected probabilities of errors of the first and second kind become linear in logarithmic coordinates.
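A minimal sketch of the thresholding idea (not the authors' trained network): each test's statistic, represented here by its p-value, is compared with a precomputed threshold to yield one output bit, and the three bits form the network's binary code. The thresholds are placeholders, and SciPy's omnibus normality test stands in for the classical chi-squared goodness-of-fit test:

```python
import numpy as np
from scipy import stats

def three_bit_code(sample, thresholds=(0.05, 0.05, 0.05)):
    """Combine three normality tests into a 3-bit code; a bit of 1 means
    that test rejects normality at its (placeholder) threshold."""
    x = (sample - sample.mean()) / sample.std(ddof=1)
    p_values = (stats.normaltest(x).pvalue,            # chi-squared-type omnibus test
                stats.cramervonmises(x, "norm").pvalue,
                stats.shapiro(x).pvalue)
    return [int(p < t) for p, t in zip(p_values, thresholds)]

rng = np.random.default_rng(0)
print(three_bit_code(rng.normal(size=21)))   # small biometric-style sample
```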
APA, Harvard, Vancouver, ISO, and other styles
3

Romansky, Radi. "Mathematical Model Investigation of a Technological Structure for Personal Data Protection." Axioms 12, no. 2 (January 18, 2023): 102. http://dx.doi.org/10.3390/axioms12020102.

Full text
Abstract:
The contemporary digital age is characterized by the massive use of different information technologies and services in the cloud. This raises the following question: “Are personal data processed correctly in global environments?” It is known that there are many requirements that the Data Controller must fulfil. For this reason, this article presents a point of view on transferring some personal data processing activities from a traditional system to a cloud environment. The main goal is to investigate the differences between the two versions of data processing. To achieve this goal, a preliminary deterministic formalization of the two cases using a Data Flow Diagram is made. The second phase is a mathematical (stochastic) model investigation on the basis of the Markov chain apparatus. Analytical models are designed and their solutions are determined. The final probabilities for the important states are determined by analytical calculation; for the traditional version, the higher values are obtained for data processing in registers (state “2”, write/read access: 0.353; state “3”, personal data updating: 0.212). The investigation of the cloud computing version shows that it is the probability of state “2” that increases. A discussion of the obtained assessment based on a graphical presentation of the analytical results is given, which permits us to show the differences between the final probabilities of the states in the two versions of personal data processing.
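The final state probabilities of such a Markov chain model are the stationary distribution of its transition matrix. The 4-state matrix below is a made-up stand-in, not the article's model:

```python
import numpy as np

# Hypothetical transition matrix of a 4-state Markov chain describing
# personal data processing states (e.g. idle, write/read, update, archive).
P = np.array([
    [0.10, 0.50, 0.30, 0.10],
    [0.20, 0.40, 0.30, 0.10],
    [0.25, 0.35, 0.20, 0.20],
    [0.30, 0.30, 0.20, 0.20],
])

# The stationary distribution pi solves pi P = pi with sum(pi) = 1.
# Solve the equivalent system (P^T - I) pi = 0 plus the normalisation row.
n = P.shape[0]
A = np.vstack([P.T - np.eye(n), np.ones(n)])
b = np.zeros(n + 1)
b[-1] = 1.0
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(pi)   # final probabilities of the states
```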
APA, Harvard, Vancouver, ISO, and other styles
4

Tkachenko, Kirill. "PROVIDING A DEPENDABLE OPERATION OF THE DATA PROCESSING SYSTEM WITH INTERVAL CHANGES IN THE FLOW CHARACTERISTICS BASED ON ANALYTICAL SIMULATIONS." Automation and modeling in design and management 2021, no. 3-4 (December 30, 2021): 25–30. http://dx.doi.org/10.30987/2658-6436-2021-3-4-25-30.

Full text
Abstract:
The article proposes a new approach for adjusting the parameters of computing nodes that form part of a data processing system, based on analytical simulation of a queuing system with subsequent estimation of the probabilities of hypotheses regarding the computing node's state. Methods of analytical modeling of queuing systems and mathematical statistics are used. The result of the study is a mathematical model for assessing the information situation at a computing node, which differs from the previously published system models. Estimating the conditional probabilities of hypotheses concerning adequate data processing by a computing node makes it possible to decide whether the node's parameters need adjusting. This adjustment improves the efficiency of handling tasks on the computing node of the data processing system. Implementing the proposed model for adjusting the parameters of a computing node increases both the efficiency of the applications running on the node and, in general, the efficiency of its operation. Applying the approach to all computing nodes of the data processing system increases the dependability of the system as a whole.
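The decision step, estimating conditional probabilities of hypotheses about the node's state, amounts to a Bayes-rule update over observed queue behaviour. The hypotheses, rates, and Poisson service model below are illustrative assumptions, not the article's model:

```python
from scipy import stats

# Two hypotheses about a computing node: adequate vs degraded processing,
# with assumed mean task-completion rates (tasks per observation interval).
rates = {"H0 (adequate)": 10.0, "H1 (degraded)": 4.0}
priors = {"H0 (adequate)": 0.5, "H1 (degraded)": 0.5}

observed = 6   # tasks actually completed in the interval (assumed datum)

# Conditional probabilities of the hypotheses by Bayes' rule,
# assuming Poisson-distributed task counts.
post = {h: stats.poisson.pmf(observed, r) * priors[h] for h, r in rates.items()}
total = sum(post.values())
for h, p in post.items():
    print(h, round(p / total, 3))
# The node's parameters would be re-tuned when the posterior probability
# of the 'degraded' hypothesis exceeds a chosen threshold.
```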
APA, Harvard, Vancouver, ISO, and other styles
5

Groot, Perry, Christian Gilissen, and Michael Egmont-Petersen. "Error probabilities for local extrema in gene expression data." Pattern Recognition Letters 28, no. 15 (November 2007): 2133–42. http://dx.doi.org/10.1016/j.patrec.2007.06.017.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Čajka, Radim, and Martin Krejsa. "Measured Data Processing in Civil Structure Using the DOProC Method." Advanced Materials Research 859 (December 2013): 114–21. http://dx.doi.org/10.4028/www.scientific.net/amr.859.114.

Full text
Abstract:
This paper describes the use of measured values in probabilistic tasks by means of a new method currently under development, Direct Optimized Probabilistic Calculation (DOProC). The method has been used to solve a number of probabilistic tasks. DOProC has been implemented in the ProbCalc software; part of this software is a module for entering and assessing the measured data. The software can read values saved in a text file and create histograms with a non-parametric (empirical) distribution of the probabilities. In the case of a parametric distribution, it is possible to select from among 24 defined types and identify the best choice using the coefficient of determination. This approach has been used, for instance, for the modelling and experimental validation of the reliability of an additionally prestressed masonry structure.
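The fit-selection step, choosing the best parametric distribution by the coefficient of determination against the measured histogram, can be sketched as follows; the three candidate families stand in for ProbCalc's 24 defined types:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
measured = rng.lognormal(mean=0.0, sigma=0.4, size=500)   # stand-in data

# Empirical (non-parametric) histogram of the measured values.
density, edges = np.histogram(measured, bins=30, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

best = None
for dist in (stats.norm, stats.lognorm, stats.gamma):   # 3 of the 24 types
    params = dist.fit(measured)
    fitted = dist.pdf(centers, *params)
    ss_res = np.sum((density - fitted) ** 2)
    ss_tot = np.sum((density - density.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot          # coefficient of determination
    if best is None or r2 > best[1]:
        best = (dist.name, r2)

print("best parametric fit:", best)
```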
APA, Harvard, Vancouver, ISO, and other styles
7

Chervyakov, N. I., P. A. Lyakhov, and A. R. Orazaev. "3D-generalization of impulse noise removal method for video data processing." Computer Optics 44, no. 1 (February 2020): 92–100. http://dx.doi.org/10.18287/2412-6179-co-577.

Full text
Abstract:
The paper proposes a generalized method of adaptive median impulse noise filtering for video data processing. The method is based on the combined use of iterative processing and transformation of the median filtering result based on the Lorentz distribution. Four different combinations of the algorithmic blocks of the method are proposed. The experimental part of the paper presents the results of comparing the quality of the proposed method with known analogues. Video distorted by impulse noise with pixel distortion probabilities from 1% to 99% inclusive was used for the simulation. A numerical assessment of the quality of cleaning video data of noise, based on the mean square error (MSE) and structural similarity (SSIM), showed that the proposed method gives the best processing result in all considered cases compared with the known approaches. The results obtained in the paper can be used in practical applications of digital video processing, for example, in video surveillance, identification and industrial process control systems.
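A basic two-dimensional adaptive median filter conveys the core idea that the paper generalises to video; the window-growth rule and impulse test below are standard textbook choices, not the authors' exact algorithm:

```python
import numpy as np

def adaptive_median(img, max_win=7):
    """Adaptive median filter for salt-and-pepper (impulse) noise: the
    window at each pixel grows until the local median is not itself an
    impulse, then the pixel is replaced only if it is extreme."""
    pad = max_win // 2
    padded = np.pad(img, pad, mode="reflect")
    out = img.copy()
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            for w in range(3, max_win + 1, 2):
                r = w // 2
                win = padded[i + pad - r:i + pad + r + 1,
                             j + pad - r:j + pad + r + 1]
                lo, med, hi = win.min(), np.median(win), win.max()
                if lo < med < hi:                    # median is not an impulse
                    if not (lo < img[i, j] < hi):    # pixel is an impulse
                        out[i, j] = med
                    break
    return out

rng = np.random.default_rng(0)
frame = rng.integers(60, 200, size=(64, 64)).astype(float)
noisy = frame.copy()
mask = rng.random(frame.shape) < 0.2                 # 20% impulse noise
noisy[mask] = rng.choice([0.0, 255.0], size=mask.sum())
print("MSE before:", np.mean((noisy - frame) ** 2).round(1),
      "after:", np.mean((adaptive_median(noisy) - frame) ** 2).round(1))
```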
APA, Harvard, Vancouver, ISO, and other styles
8

Li, Qiude, Qingyu Xiong, Shengfen Ji, Junhao Wen, Min Gao, Yang Yu, and Rui Xu. "Using fine-tuned conditional probabilities for data transformation of nominal attributes." Pattern Recognition Letters 128 (December 2019): 107–14. http://dx.doi.org/10.1016/j.patrec.2019.08.024.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Jain, Kirti. "Sentiment Analysis on Twitter Airline Data." International Journal for Research in Applied Science and Engineering Technology 9, no. VI (June 30, 2021): 3767–70. http://dx.doi.org/10.22214/ijraset.2021.35807.

Full text
Abstract:
Sentiment analysis, also known as sentiment mining, is a machine learning task where we want to determine the overall sentiment of a particular document. With machine learning and natural language processing (NLP), we can extract the information in a text and try to classify it as positive, neutral, or negative according to its polarity. In this project, we try to classify Twitter tweets into positive, negative, and neutral sentiments by building a model based on probabilities. Twitter is a blogging website where people can quickly and spontaneously share their feelings by sending tweets limited to 140 characters. Because of this, Twitter is a perfect source of data for gauging the latest general opinion on anything.
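A probability-based tweet classifier of the kind described can be sketched with a multinomial naive Bayes model; the toy tweets and labels are invented for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny invented training set; a real model would use labelled airline tweets.
tweets = ["great flight, friendly crew", "flight delayed again, awful service",
          "landed on time", "lost my luggage, terrible", "seats were fine"]
labels = ["positive", "negative", "neutral", "negative", "neutral"]

vec = CountVectorizer()
X = vec.fit_transform(tweets)
model = MultinomialNB().fit(X, labels)

test = vec.transform(["crew was great but the flight was delayed"])
for cls, p in zip(model.classes_, model.predict_proba(test)[0]):
    print(cls, round(p, 3))   # class probabilities for the new tweet
```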
APA, Harvard, Vancouver, ISO, and other styles
10

Buhmann, Joachim, and Hans Kühnel. "Complexity Optimized Data Clustering by Competitive Neural Networks." Neural Computation 5, no. 1 (January 1993): 75–88. http://dx.doi.org/10.1162/neco.1993.5.1.75.

Full text
Abstract:
Data clustering is a complex optimization problem with applications ranging from vision and speech processing to data transmission and data storage in technical as well as in biological systems. We discuss a clustering strategy that explicitly reflects the tradeoff between simplicity and precision of a data representation. The resulting clustering algorithm jointly optimizes distortion errors and complexity costs. A maximum entropy estimation of the clustering cost function yields an optimal number of clusters, their positions, and their cluster probabilities. Our approach establishes a unifying framework for different clustering methods like K-means clustering, fuzzy clustering, entropy constrained vector quantization, or topological feature maps and competitive neural networks.
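The maximum-entropy estimation assigns points to clusters with soft probabilities given by a Gibbs distribution over distortion costs; the sketch below shows such fuzzy assignment-and-update iterations under an assumed temperature, not the authors' full complexity-cost functional:

```python
import numpy as np

rng = np.random.default_rng(2)
data = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])
centers = data[rng.choice(len(data), 3, replace=False)]   # 3 trial clusters
T = 0.5   # 'temperature' trading off precision against complexity

for _ in range(20):
    # Squared distortion of every point to every cluster centre.
    d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    # Maximum-entropy (Gibbs) membership probabilities; subtracting the
    # row-wise minimum keeps the exponentials numerically stable.
    g = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / T)
    p = g / g.sum(axis=1, keepdims=True)
    # Soft re-estimation of centres from the membership probabilities.
    centers = (p.T @ data) / p.sum(axis=0)[:, None]

print("cluster probabilities:", p.mean(axis=0).round(3))
```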
APA, Harvard, Vancouver, ISO, and other styles
11

Batchelder, William H., Xiangen Hu, and Jared B. Smith. "Multinomial Processing Tree Models for Discrete Choice." Zeitschrift für Psychologie / Journal of Psychology 217, no. 3 (January 2009): 149–58. http://dx.doi.org/10.1027/0044-3409.217.3.149.

Full text
Abstract:
This paper shows how to develop new multinomial processing tree (MPT) models for discrete choice, and in particular binary choice. First it reviews the history of discrete choice with special attention to Duncan Luce’s book Individual Choice Behavior. Luce’s choice axiom leads to the Bradley-Terry-Luce (BTL) paired-comparison model which is the basis of logit models of discrete choice used throughout the social and behavioral sciences. It is shown that a reparameterization of the BTL model is represented by choice probabilities generated from a finite state Markov chain, and this representation is closely related to the rooted tree structure of MPT models. New MPT models of binary choice can be obtained by placing restrictions on this representation of the BTL model. Several of these new MPT models for paired comparisons are described, compared to the BTL model, and applied to data from a replicated round-robin data structure.
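The BTL model at the heart of this line of work is compact: each object i has a positive strength v_i, and the probability of choosing i over j is v_i / (v_i + v_j). A minimal sketch with invented strengths:

```python
import numpy as np

v = {"A": 2.0, "B": 1.0, "C": 0.5}   # invented BTL strength parameters

def btl_prob(i, j):
    """Probability that object i is chosen over object j under BTL."""
    return v[i] / (v[i] + v[j])

print(round(btl_prob("A", "B"), 3))   # 0.667

# Simulated round-robin style binary choice data for one pair:
rng = np.random.default_rng(5)
wins = sum(rng.random() < btl_prob("A", "B") for _ in range(1000))
print(wins / 1000)   # empirical choice frequency approaches the BTL value
```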
APA, Harvard, Vancouver, ISO, and other styles
12

Behrendt, Marco, Marius Bittner, Liam Comerford, Michael Beer, and Jianbing Chen. "Relaxed power spectrum estimation from multiple data records utilising subjective probabilities." Mechanical Systems and Signal Processing 165 (February 2022): 108346. http://dx.doi.org/10.1016/j.ymssp.2021.108346.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Kreinovich, Vladik, Hung T. Nguyen, and Songsak Sriboonchitta. "Need for Data Processing Naturally Leads to Fuzzy Logic (and Neural Networks): Fuzzy Beyond Experts and Beyond Probabilities." International Journal of Intelligent Systems 31, no. 3 (October 12, 2015): 276–93. http://dx.doi.org/10.1002/int.21785.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Lachikhina, A. B., and A. A. Petrakov. "Data integrity as a criterion for assessing the security of corporate information systems resources." Issues of radio electronics, no. 11 (November 20, 2019): 77–81. http://dx.doi.org/10.21778/2218-5453-2019-11-77-81.

Full text
Abstract:
The paper considers the assessment of information resource protection within information security management at an industrial enterprise. The main aspects of information security as a process are given. It is proposed to use data integrity as the criterion for assessing the security of corporate information system resources, defined as the probability of a possible integrity violation in the corresponding information processing process. The groups of technological operations related to information processing are considered. An approximate set of probabilities of possible events that contribute to maintaining data integrity is given. For the mathematical formulation of the problem, each event is treated as an alternative with a given optimization criterion. Introducing an objective function over the set of alternatives allows the best one to be selected and the cause of the integrity violation to be determined. The dependence of the total probability of an integrity violation on the a priori probability distribution is noted.
APA, Harvard, Vancouver, ISO, and other styles
15

Andriyanov, Nikita A., Madina-Bonu R. Atakhodzhaeva, and Evgeny I. Borodin. "Mathematical modeling of recommender system and data processing of a telecommunications company using machine learning models." Bulletin of the South Ural State University. Ser. Computer Technologies, Automatic Control & Radioelectronics 22, no. 2 (April 2022): 17–28. http://dx.doi.org/10.14529/ctcr220202.

Full text
Abstract:
The purpose of the study is to develop data modeling methods for designing recommender algorithms using doubly stochastic autoregressive models of random processes, and to check their adequacy by applying machine learning algorithms to cluster users in a simulated data set and predict probabilities of interest. Research methods. The article discusses the methods used in the construction of recommender systems. At the same time, the problem of modeling user behavior with a doubly stochastic model is considered. This model is proposed for generating artificial data. The doubly stochastic model allows non-stationary processes to be generated, thus creating users with different probabilistic properties in different groups of objects of interest. After that, the artificially created users (and their activity) are clustered with a modified K-means algorithm; the main modification is the automatic pre-estimation of the number of clusters, rather than its choice by a person. Next, the behavior of representatives of each user group for new events is modeled. Based on the generated information and training data, the problem of predicting and ranking the offered services is solved. At the first stage, regression models are sufficient to assign users to a group and form offers for them. Results of the study. On the training data in 2 clusters, high determination indices were achieved, indicating that approximately 90% of the variance is explained when using the proposed doubly stochastic model. Particular attention is paid to the operation of modern recommender systems, using the example of the Disco system developed by Yandex. In addition, pre-processing and preliminary analysis of real-sector data were performed, namely data of a telecommunications company. A test recommender system was developed for issuing relevant offers of communication services. Conclusion. Thus, the main results of the work include a mathematical model that simulates the reaction of users to various services, as well as a logistic regression model used to predict the probability of a user's interest in a new service. Based on the predicted probabilities, it is straightforward to rank new offers. Testing on the synthesized data showed the high efficiency of the model.
APA, Harvard, Vancouver, ISO, and other styles
16

Kalakoti, Yogesh, Shashank Yadav, and Durai Sundar. "SurvCNN: A Discrete Time-to-Event Cancer Survival Estimation Framework Using Image Representations of Omics Data." Cancers 13, no. 13 (June 22, 2021): 3106. http://dx.doi.org/10.3390/cancers13133106.

Full text
Abstract:
The utility of multi-omics in personalized therapy and cancer survival analysis has been debated and demonstrated extensively in the recent past. Most of the current methods still suffer from data constraints such as high-dimensionality, unexplained interdependence, and subpar integration methods. Here, we propose SurvCNN, an alternative approach to process multi-omics data with robust computer vision architectures, to predict cancer prognosis for Lung Adenocarcinoma patients. Numerical multi-omics data were transformed into their image representations and fed into a Convolutional Neural network with a discrete-time model to predict survival probabilities. The framework also dichotomized patients into risk subgroups based on their survival probabilities over time. SurvCNN was evaluated on multiple performance metrics and outperformed existing methods with a high degree of confidence. Moreover, comprehensive insights into the relative performance of various combinations of omics datasets were probed. Critical biological processes, pathways and cell types identified from downstream processing of differentially expressed genes suggested that the framework could elucidate elements detrimental to a patient’s survival. Such integrative models with high predictive power would have a significant impact and utility in precision oncology.
APA, Harvard, Vancouver, ISO, and other styles
17

PEELER, JAMES T., and V. KELLY BUNNING. "Hazard Assessment of Listeria monocytogenes in the Processing of Bovine Milk." Journal of Food Protection 57, no. 8 (August 1, 1994): 689–97. http://dx.doi.org/10.4315/0362-028x-57.8.689.

Full text
Abstract:
The steps in Grade A production of bovine milk for human consumption were assessed. A cumulative distribution of values, based on published data, was used to evaluate steps for milking, storage, transportation and pasteurization. Conservative estimates of parameters in the distributions were used to compute concentrations and probabilities. Under normal operations, the probability was less than 2 in 100 that one Listeria monocytogenes cell occurs in every 2 gallons of milk processed at exactly 71.7°C for 15 s. The probability was less than 2 in 100 that one cell occurs in 3.8 × 1010 gallons processed at 74.4°C for 20 s.
APA, Harvard, Vancouver, ISO, and other styles
18

Lee, Se-Jin, Byung-Jae Park, Jong-Hwan Lim, and Dong-Woo Cho. "Feature map management for mobile robots in dynamic environments." Robotica 28, no. 1 (April 28, 2009): 97–106. http://dx.doi.org/10.1017/s026357470900561x.

Full text
Abstract:
This paper presents a new approach to the management of the environmental map for mobile robots in dynamic environments. The environmental map is built of primitive features, such as lines, points, and even circles, extracted from ambiguous data captured by the robot's sonar sensor ring. The feature map must be managed because the indoor surroundings where mobile robots operate are continuously changing due to nonstationary objects, such as wastebaskets, tables, and people. The features are processed by trimming, division, or removal, depending on the dynamic circumstances. All processing refers to the occupancy probabilities of grid squares generated for the map features. The occupancy probabilities of the squares are updated using the Bayesian updating model with the sonar sensor data. Experimental results demonstrate the validity of the proposed method.
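The Bayesian occupancy update for a single grid square is conveniently written in log-odds form; the inverse sensor model probabilities below are illustrative, not the paper's sonar model:

```python
import numpy as np

# Assumed inverse sensor model: P(occupied | sonar hit) and P(occupied | miss).
P_HIT, P_MISS = 0.7, 0.4

def update(log_odds, hit):
    """Bayesian update of one grid square's occupancy in log-odds form."""
    p = P_HIT if hit else P_MISS
    return log_odds + np.log(p / (1.0 - p))

l = 0.0                                          # prior P(occupied) = 0.5
for measurement in [True, True, False, True]:    # sonar returns for the square
    l = update(l, measurement)

print("P(occupied) =", round(1.0 / (1.0 + np.exp(-l)), 3))
```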
APA, Harvard, Vancouver, ISO, and other styles
19

Levin, E., W. Roland, R. Habibi, Z. An, and R. Shults. "RAPID VISUAL PRESENTATION TO SUPPORT GEOSPATIAL BIG DATA PROCESSING." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B4-2020 (August 25, 2020): 463–70. http://dx.doi.org/10.5194/isprs-archives-xliii-b4-2020-463-2020.

Full text
Abstract:
Given the limited number of human GIS/image analysts at any organization, the use of their time and organizational resources is important, especially in light of Big Data application scenarios, when organizations may be overwhelmed with vast amounts of geospatial data. The current manuscript is devoted to the description of experimental research outlining the concept of Human-Computer Symbiosis, where computers perform tasks such as classification on a large image dataset and, in sequence, humans perform analysis with Brain-Computer Interfaces (BCIs) to classify those images that machine learning had difficulty with. The addition of BCI analysis is to utilize the brain's ability to better answer questions like: “Is the object in this image the object being sought?” In order to determine the feasibility of such a system, a supervised multi-layer convolutional neural network (CNN) was trained to detect the difference between ‘ships’ and ‘no ships’ in satellite imagery data. A prediction layer was then added to the trained model to output the probability that a given image belongs to each of those two classes. If the probabilities were within one standard deviation of the mean of a Gaussian distribution centered at 0.5, they were stored in a separate dataset for Rapid Serial Visual Presentations (RSVP), implemented with PsychoPy, to a human analyst using a low-cost EMOTIV “Insight” EEG BCI headset. During the RSVP phase, hundreds of images per minute can be presented sequentially. At such a pace, human analysts are not capable of making conscious decisions about what is in each image; however, the subliminal “aha moment” can still be detected by the headset. These moments are detected through the analysis of Event-Related Potentials (ERPs), specifically the P300 ERP. If a P300 ERP is generated for the detection of a ship, the relevant image is moved to its designated dataset; otherwise, if the image classification is still unclear, it is set aside for another RSVP iteration, in which the time afforded to the analyst for observing each image is increased. If classification is still uncertain after a reasonable number of RSVP iterations, the images in question are located within the grid matrix of their larger image scene. The images adjacent to those of interest on the grid are then added to the presentation to give the analyst more contextual information via an expanded field of view. If classification is still uncertain, one final expansion of the field of view is afforded. Lastly, if the classification of the image remains indeterminable, the image is stored in an archive dataset.
APA, Harvard, Vancouver, ISO, and other styles
20

Atkinson, Dale, Mark Chladil, Volker Janssen, and Arko Lucieer. "Implementation of quantitative bushfire risk analysis in a GIS environment." International Journal of Wildland Fire 19, no. 5 (2010): 649. http://dx.doi.org/10.1071/wf08185.

Full text
Abstract:
Bushfires pose a significant threat to lives and property. Fire management authorities aim to minimise this threat by employing risk-management procedures. This paper proposes a process of implementing, in a Geographic Information System environment, contemporary integrated approaches to bushfire risk analysis that incorporate the dynamic effects of bushfires. The system is illustrated with a case study combining ignition, fire behaviour and fire propagation models with climate, fuel, terrain, historical ignition and asset data from Hobart, Tasmania, and its surroundings. Many of the implementation issues involved with dynamic risk modelling are resolved, such as increasing processing efficiency and quantifying probabilities using historical data. A raster-based, risk-specific bushfire simulation system is created, using a new, efficient approach to model fire spread and a spatiotemporal algorithm to estimate spread probabilities. We define a method for modelling ignition probabilities using representative conditions in order to manage large fire weather datasets. Validation of the case study shows that the system can be used efficiently to produce a realistic output in order to assess the risk posed by bushfire. The model has the potential to be used as a reliable near-real-time tool for assisting fire management decision making.
APA, Harvard, Vancouver, ISO, and other styles
21

Jia, Fu Quan, Zhang Wei He, Zhu Jun Tian, Zhao Bo Chen, Hong Cheng Wang, Yi Meng Chen, and Bao Jun Jiang. "The Application of Mathematical Statistical Method on Effectiveness Evaluation of a Low-Strength Complex Wastewater Treatment System." Advanced Materials Research 838-841 (November 2013): 2532–38. http://dx.doi.org/10.4028/www.scientific.net/amr.838-841.2532.

Full text
Abstract:
Mathematical statistical methods (MSM) are very important tools for processing and analysing wastewater treatment plant data. In this study, MSM were used to evaluate the efficiency, stability and reliability of a low-strength complex wastewater treatment system (LSCWWTS). Results showed that when LSCWWTSs are upgraded in the future, the influent parameters could be set as follows: COD = 738 mg/L, BOD = 300 mg/L, SS = 454 mg/L, TN = 64.3 mg/L, NH3-N = 65.2 mg/L and TP = 7.65 mg/L. For the effluent of the LSCWWTS, the stability classes of COD, BOD, SS and TN were all A, TP was B, and NH3-N was D. For effluent COD, TN and NH3-N reaching the first-class A discharge standard, the reliability probabilities were 83.89%, 80.23% and 99.43%, respectively, and the reliability probabilities of the effluent water quality parameters (WQPs) reaching the first-class B discharge standard were all above 98%.
APA, Harvard, Vancouver, ISO, and other styles
22

Wang, Diangang, Shuo Song, Wei Gan, and Kun Huang. "An algorithm for detecting abnormal electricity mode of power users." MATEC Web of Conferences 189 (2018): 04001. http://dx.doi.org/10.1051/matecconf/201818904001.

Full text
Abstract:
In order to reduce non-technical losses and the operating costs of the power company, an abnormal power consumption detection algorithm is proposed. The algorithm includes feature extraction, principal component analysis, grid processing, and local outlier detection. First, several feature quantities that characterize a user's power consumption pattern are extracted, and the users are mapped to a two-dimensional plane by principal component analysis, which makes the data easy to visualize and the local outlier factors easy to calculate; grid processing techniques then filter out data points in low-density regions. The algorithm reduces the number of training samples in the power user data set and outputs the anomaly scores and probabilities of all users' behavior. The experimental results show that, using the resulting ranking, a large number of abnormal users can be found by inspecting only a few of them, significantly improving the efficiency of the algorithm.
APA, Harvard, Vancouver, ISO, and other styles
23

GERSTOFT, PETER. "GLOBAL INVERSION BY GENETIC ALGORITHMS FOR BOTH SOURCE POSITION AND ENVIRONMENTAL PARAMETERS." Journal of Computational Acoustics 02, no. 03 (September 1994): 251–66. http://dx.doi.org/10.1142/s0218396x94000178.

Full text
Abstract:
The data set from the Workshop on Acoustic Models in Signal Processing (May 1993) is inverted in order to find both the environmental parameters and the source position. Genetic algorithms are used for the optimization. When using genetic algorithms, the responses from many environmental parameter sets are computed in order to estimate the solution. All these samples of the parameter space are used to estimate the a posteriori probabilities of the model parameters. Thus the uniqueness and uncertainty of the model parameters are assessed.
APA, Harvard, Vancouver, ISO, and other styles
24

Reda, Islam, Ashraf Khalil, Mohammed Elmogy, Ahmed Abou El-Fetouh, Ahmed Shalaby, Mohamed Abou El-Ghar, Adel Elmaghraby, Mohammed Ghazal, and Ayman El-Baz. "Deep Learning Role in Early Diagnosis of Prostate Cancer." Technology in Cancer Research & Treatment 17 (January 1, 2018): 153303461877553. http://dx.doi.org/10.1177/1533034618775530.

Full text
Abstract:
The objective of this work is to develop a computer-aided diagnostic system for early diagnosis of prostate cancer. The presented system integrates both clinical biomarkers (prostate-specific antigen) and features extracted from diffusion-weighted magnetic resonance imaging collected at multiple b values. The system performs three major processing steps. First, it delineates the prostate using a hybrid approach that combines a level-set model with nonnegative matrix factorization. Second, it estimates and normalizes the diffusion parameters, namely the apparent diffusion coefficients of the delineated prostate volumes at different b values, refines those apparent diffusion coefficients using a generalized Gaussian Markov random field model, and then constructs the cumulative distribution functions of the processed apparent diffusion coefficients at multiple b values. In parallel, a K-nearest neighbor classifier is employed to transform the prostate-specific antigen results into diagnostic probabilities. Finally, those prostate-specific antigen–based probabilities are integrated with the initial diagnostic probabilities obtained using stacked nonnegativity-constrained sparse autoencoders that employ the apparent diffusion coefficient cumulative distribution functions, for better diagnostic accuracy. Experiments conducted on 18 diffusion-weighted magnetic resonance imaging data sets achieved 94.4% diagnostic accuracy (sensitivity = 88.9% and specificity = 100%), which indicates the promising performance of the presented computer-aided diagnostic system.
APA, Harvard, Vancouver, ISO, and other styles
25

Su, Daiqin, Robert Israel, Kunal Sharma, Haoyu Qi, Ish Dhand, and Kamil Brádler. "Error mitigation on a near-term quantum photonic device." Quantum 5 (May 4, 2021): 452. http://dx.doi.org/10.22331/q-2021-05-04-452.

Full text
Abstract:
Photon loss is destructive to the performance of quantum photonic devices, and therefore suppressing the effects of photon loss is paramount to photonic quantum technologies. We present two schemes to mitigate the effects of photon loss for a Gaussian Boson Sampling device, in particular, to improve the estimation of the sampling probabilities. Instead of using error correction codes, which are expensive in terms of their hardware resource overhead, our schemes require only a small amount of hardware modification or even none at all. Our loss-suppression techniques rely either on collecting additional measurement data or on classical post-processing once the measurement data are obtained. We show that, with a moderate cost of classical post-processing, the effects of photon loss can be significantly suppressed for a certain amount of loss. The proposed schemes are thus a key enabler for applications of near-term photonic quantum devices.
APA, Harvard, Vancouver, ISO, and other styles
26

Zhou, X. Y., and L. Sun. "A NEW CLOUD SHADOW DETECTION ALGORITHM BASED ON PRIOR LAND TYPE DATABASE SUPPORT." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B3-2020 (August 21, 2020): 849–51. http://dx.doi.org/10.5194/isprs-archives-xliii-b3-2020-849-2020.

Full text
Abstract:
Cloud shadow detection is one of the basic steps in remote sensing image processing. Threshold methods are commonly used at present because of their easy implementation and good accuracy. To address the problems that common threshold settings struggle to cope with complex surface conditions and that the results are binary, this paper proposes a cloud shadow pixel detection method supported by land cover data, which calculates shadow probabilities, using Landsat 8 data as an example. A validation against visual interpretation is then used to verify the accuracy. The results show that the method achieves good cloud shadow detection performance.
APA, Harvard, Vancouver, ISO, and other styles
27

Raskin, Lev, Oksana Sira, Larysa Sukhomlyn, and Roman Korsun. "Analysis of semi-Markov systems with fuzzy initial data." EUREKA: Physics and Engineering, no. 2 (March 31, 2022): 128–42. http://dx.doi.org/10.21303/2461-4262.2022.002346.

Full text
Abstract:
In the real operating conditions of complex systems, random changes in their possible states occur in the course of operation. The traditional approach to describing such systems uses Markov models. However, the real non-deterministic mechanism that controls the duration of the system's stay in each of its possible states makes the resulting models insufficiently adequate. This circumstance makes it expedient to consider models that are more general than Markov ones. In addition, when choosing such models, one should take into account a fundamental and frequently manifested feature of the statistical material actually used in processing an array of observations: its small sample size. All of this makes it relevant to study the possibility of developing less demanding, tolerant models of the behavior of complex systems. A method is proposed for the analysis of systems described by semi-Markov models under conditions of initial data uncertainty. The main approaches to describing this uncertainty are considered: probabilistic, fuzzy, and bi-fuzzy. A procedure has been developed for determining the membership functions of fuzzy numbers from the results of real data processing. Next, the following tasks are solved sequentially. First, the vector of stationary state probabilities of the Markov chain embedded in the semi-Markov process is found. Then, the set of expected durations of the system's stay in each state before leaving it is determined, after which the required probability distribution of the system states is calculated. The proposed method is extended to solve the problem in the case when the parameters of the membership functions of the fuzzy initial data cannot be clearly estimated under small-sample conditions.
APA, Harvard, Vancouver, ISO, and other styles
28

Lo, James Ting-Ho, and Bryce Mackey-Williams Carey. "A Cortical Learning Machine for Learning Real-Valued and Ranked Data." International Journal of Clinical Medicine and Bioengineering 1, no. 1 (December 30, 2021): 12–24. http://dx.doi.org/10.35745/ijcmb2021v01.01.0003.

Full text
Abstract:
The cortical learning machine (CLM) introduced in [1-3] is a low-order computational model of the neocortex. It has real-time, photographic, unsupervised, and hierarchical learning capabilities, which existing learning machines such as the multilayer perceptron and the convolutional neural network do not have. The CLM is a network of processing units (PUs), each comprising novel computational models of dendrites (for encoding), synapses (for storing code covariance matrices), spiking/nonspiking somas (for evaluating empirical probabilities and generating spikes), and unsupervised/supervised Hebbian learning schemes. In this paper, the masking matrix in the CLM of [1-3] is generalized to enable the CLM to learn ranked and real-valued data in the form of binary numbers and unary (thermometer) codes. The general masking matrix assigns weights to the bits in the binary and unary codes to reflect their relative significance. Numerical examples are provided to illustrate that a single PU with the general masking matrix is a pattern recognizer with an efficacy comparable to that of leading statistical and machine learning methods, showing the potential of CLMs with multiple PUs, especially in consideration of the aforementioned capabilities of the CLM.
APA, Harvard, Vancouver, ISO, and other styles
29

Caballero-Águila, R., A. Hermoso-Carazo, and J. Linares-Pérez. "Least-Squares Filtering Algorithm in Sensor Networks with Noise Correlation and Multiple Random Failures in Transmission." Mathematical Problems in Engineering 2017 (2017): 1–9. http://dx.doi.org/10.1155/2017/1570719.

Full text
Abstract:
This paper addresses the least-squares centralized fusion estimation problem of discrete-time random signals from measured outputs, which are perturbed by correlated noises. These measurements are obtained by different sensors, which send their information to a processing center, where the complete set of data is combined to obtain the estimators. Due to random transmission failures, some of the data packets processed for the estimation may either contain only noise (uncertain observations), be delayed (randomly delayed observations), or even be definitely lost (random packet dropouts). These multiple random transmission uncertainties are modelled by sequences of independent Bernoulli random variables with different probabilities for the different sensors. By an innovation approach and using the last observation that successfully arrived when a packet is lost, a recursive algorithm is designed for the filtering estimation problem. The proposed algorithm is easily implemented and does not require knowledge of the signal evolution model, as only the first- and second-order moments of the processes involved are used. A numerical simulation example illustrates the feasibility of the proposed estimators and shows how the probabilities of the multiple random failures influence their performance.
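The three kinds of transmission failure are modelled with independent Bernoulli variables per sensor; a sketch of simulating one sensor's packet stream under assumed failure probabilities:

```python
import numpy as np

rng = np.random.default_rng(3)
signal = np.sin(0.1 * np.arange(100))          # stand-in discrete-time signal
noise = 0.1 * rng.standard_normal(100)

# Assumed per-packet failure probabilities for one sensor.
p_uncertain, p_delay, p_drop = 0.1, 0.2, 0.1

received, last_ok = [], 0.0
for k, z in enumerate(signal + noise):
    u = rng.random()
    if u < p_drop:
        received.append(last_ok)        # packet lost: reuse last good packet
    elif u < p_drop + p_delay and k > 0:
        received.append(signal[k - 1] + noise[k - 1])   # one-step delay
    elif u < p_drop + p_delay + p_uncertain:
        received.append(noise[k])       # uncertain: noise-only packet
    else:
        received.append(z)              # packet arrives correctly
        last_ok = z

print(len(received), "packets processed")
```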
APA, Harvard, Vancouver, ISO, and other styles
30

Ananda-Rajah, Michelle R., Christoph Bergmeir, François Petitjean, Monica A. Slavin, Karin A. Thursky, and Geoffrey I. Webb. "Toward Electronic Surveillance of Invasive Mold Diseases in Hematology-Oncology Patients: An Expert System Combining Natural Language Processing of Chest Computed Tomography Reports, Microbiology, and Antifungal Drug Data." JCO Clinical Cancer Informatics, no. 1 (November 2017): 1–10. http://dx.doi.org/10.1200/cci.17.00011.

Full text
Abstract:
Purpose Prospective epidemiologic surveillance of invasive mold disease (IMD) in hematology patients is hampered by the absence of a reliable laboratory prompt. This study develops an expert system for electronic surveillance of IMD that combines probabilities using natural language processing (NLP) of computed tomography (CT) reports with microbiology and antifungal drug data to improve prediction of IMD. Methods Microbiology indicators and antifungal drug–dispensing data were extracted from hospital information systems at three tertiary hospitals for 123 hematology-oncology patients. Of this group, 64 case patients had 26 probable/proven IMD according to international definitions, and 59 patients were uninfected controls. Derived probabilities from NLP combined with medical expertise identified patients at high likelihood of IMD, with remaining patients processed by a machine-learning classifier trained on all available features. Results Compared with the baseline text classifier, the expert system that incorporated the best performing algorithm (naïve Bayes) improved specificity from 50.8% (95% CI, 37.5% to 64.1%) to 74.6% (95% CI, 61.6% to 85.0%), reducing false positives by 48% from 29 to 15; improved sensitivity slightly from 96.9% (95% CI, 89.2% to 99.6%) to 98.4% (95% CI, 91.6% to 100%); and improved receiver operating characteristic area from 73.9% (95% CI, 67.1% to 80.6%) to 92.8% (95% CI, 88% to 97.5%). Conclusion An expert system that uses multiple sources of data (CT reports, microbiology, antifungal drug dispensing) is a promising approach to continuous prospective surveillance of IMD in the hospital, and demonstrates reduced false notifications (positives) compared with NLP of CT reports alone. Our expert system could provide decision support for IMD surveillance, which is critical to antifungal stewardship and improving supportive care in cancer.
APA, Harvard, Vancouver, ISO, and other styles
31

Schweizer, Karl. "Probability-Based and Measurement-Related Hypotheses With Full Restriction for Investigations by Means of Confirmatory Factor Analysis." Methodology 7, no. 4 (August 1, 2011): 157–64. http://dx.doi.org/10.1027/1614-2241/a000033.

Full text
Abstract:
Probability-based and measurement-related hypotheses for confirmatory factor analysis of repeated-measures data are investigated. Such hypotheses comprise precise assumptions concerning the relationships among the true components associated with the levels of the design or the items of the measure. Measurement-related hypotheses concentrate on the assumed processes, as, for example, transformation and memory processes, and represent treatment-dependent differences in processing. In contrast, probability-based hypotheses provide the opportunity to consider probabilities as outcome predictions that summarize the effects of various influences. The prediction of performance guided by inexact cues serves as an example. In the empirical part of this paper probability-based and measurement-related hypotheses are applied to working-memory data. Latent variables according to both hypotheses contribute to a good model fit. The best model fit is achieved for the model including latent variables that represented serial cognitive processing and performance according to inexact cues in combination with a latent variable for subsidiary processes.
APA, Harvard, Vancouver, ISO, and other styles
32

Kazachkov, E. A., S. N. Matyugin, I. V. Popov, and V. V. Sharonov. "Detection and classification of small-scale objects in images obtained by synthetic-aperture radar stations." Journal of «Almaz – Antey» Air and Space Defence Corporation, no. 1 (March 30, 2018): 93–99. http://dx.doi.org/10.38013/2542-0542-2018-1-93-99.

Full text
Abstract:
The investigation deals with the problem of simultaneous detection and classification (that is, recognition) of several classes of objects in radar images by means of convolutional neural networks. We present a two-stage processing algorithm that detects and recognises objects. It also features an intermediate sub-stage that increases the resolution of those zones where objects have been detected. We show that a considerable increase in detection and recognition probabilities is possible if the recognition module is trained using high-resolution data. We implemented the detection and recognition stages using deep learning approaches for convolutional neural networks.
APA, Harvard, Vancouver, ISO, and other styles
33

Gao, Song, Min Gao, and Qin Kun Xiao. "Multisensor Tracking of a Maneuvering Target in Clutter with Proposed Parallel Updating Approach." Advanced Materials Research 383-390 (November 2011): 344–51. http://dx.doi.org/10.4028/www.scientific.net/amr.383-390.344.

Full text
Abstract:
To solve the problem of measurement origin uncertainty, we present a proposed parallel updating approach for tracking a maneuvering target in a cluttered environment using multiple sensors. A parallel updating method is followed, where the raw sensor measurements are passed to a central processor and fed directly to the target tracker. A past approach using parallel sensor processing ignored certain data association probabilities. Simulation results show that, compared with an existing IMMPDAF algorithm with the parallel sensor approach, the IMMPDAF algorithm with the proposed parallel updating approach solves the problem of measurement origins and achieves a significant improvement in the accuracy of track estimation.
APA, Harvard, Vancouver, ISO, and other styles
34

Labriji, Ali, Abdelkrim Bennar, and Mostafa Rachik. "Estimation of the Conditional Probability Using a Stochastic Gradient Process." Journal of Mathematics 2021 (December 6, 2021): 1–7. http://dx.doi.org/10.1155/2021/7660113.

Full text
Abstract:
The use of conditional probabilities has gained in popularity in various fields such as medicine, finance, and image processing. This has occurred especially with the availability of large datasets that allow us to extract the full potential of the available estimation algorithms. Nevertheless, such a large volume of data is often accompanied by a significant need for computational capacity, as well as considerable computation time. In this article, we propose a low-cost estimation method: we first demonstrate analytically the convergence of our method to the desired probability, and then we perform a simulation to support our point.
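A low-cost estimator of a conditional probability P(Y = 1 | X in A) can be written as a Robbins-Monro style stochastic-gradient recursion of the general kind the paper analyses; the data stream, conditioning event, and step sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
p_hat, n = 0.5, 0
for _ in range(100_000):
    x = rng.normal()                 # stand-in covariate stream
    if x > 1.0:                      # conditioning event A = {X > 1}
        y = rng.random() < 0.3       # assumed true P(Y=1 | A) = 0.3
        n += 1
        step = 1.0 / n               # Robbins-Monro step size
        p_hat += step * (float(y) - p_hat)   # stochastic gradient update

print(round(p_hat, 3))   # converges to about 0.3
```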
APA, Harvard, Vancouver, ISO, and other styles
35

Wójcicki, Tomasz. "Use Of Bayesian Networks And Augmented Reality To Reliability Testing Of Complex Technical Objects." Journal of KONBiN 35, no. 1 (November 1, 2015): 179–90. http://dx.doi.org/10.1515/jok-2015-0051.

Full text
Abstract:
This paper presents a methodology developed to support tests of the reliability of complex technical objects. The presented methodology covers the use of modern information technologies in the form of algorithmic models and effective visualization techniques in the form of augmented reality. The possibility of using a probabilistic Bayesian network is considered. The method of determining the probabilities for specific nodes and the total probability distribution of the graph structure is presented. The structure of the model and its basic functions are shown. The results of verification work on connecting data processing methods with visualization techniques based on augmented reality are presented.
APA, Harvard, Vancouver, ISO, and other styles
36

ZHANG, WEN, TAKETOSHI YOSHIDA, and XIJIN TANG. "DISTRIBUTION OF MULTI-WORDS IN CHINESE AND ENGLISH DOCUMENTS." International Journal of Information Technology & Decision Making 08, no. 02 (June 2009): 249–65. http://dx.doi.org/10.1142/s0219622009003399.

Full text
Abstract:
As a hybrid of the N-gram in natural language processing and the collocation in statistical linguistics, the multi-word is becoming a hot topic in the area of text mining and information retrieval. In this paper, a study concerning the distribution of multi-words is carried out to explore a theoretical basis for a probabilistic term-weighting scheme. Specifically, the Poisson distribution, the zero-inflated binomial distribution, and the G-distribution are comparatively studied on a task of predicting the probabilities of multi-words' occurrences, for both technical and nontechnical multi-words. In addition, a rule-based multi-word extraction algorithm is proposed to extract multi-words from texts based on words' occurring patterns and syntactical structures. Our experimental results demonstrate that the G-distribution has the best capability to predict the probabilities of multi-words' occurrence frequencies, and that the Poisson distribution is comparable to the zero-inflated binomial distribution in the estimation of multi-word distribution. The outcome of this study validates that burstiness is a universal phenomenon in linguistic count data, applicable not only to individual content words but also to multi-words.
APA, Harvard, Vancouver, ISO, and other styles
37

Cheremisin, A., Y. Esipov, S. Traypichkin, and A. Bukreeva. "System risk assessment based on the probabilistic model “exposure-susceptibility” at the enterprises of storage and processing of vegetable agricultural products." IOP Conference Series: Earth and Environmental Science 937, no. 3 (December 1, 2021): 032073. http://dx.doi.org/10.1088/1755-1315/937/3/032073.

Full text
Abstract:
At present, elements of probabilistic safety and risk assessment have been introduced into the design and analysis of complex technical systems; one of their main disadvantages is the difficulty of selecting initial data in the form of probabilities of initiating events. As a consequence, the use of known methodologies for quantifying risk can lead either to underestimation of threats or to unreasonably high security costs. Using the example of an enterprise for the storage and processing of vegetable agricultural products, an approach is considered for assessing the risk of a technical system based on the probabilistic model “exposure-susceptibility”.
APA, Harvard, Vancouver, ISO, and other styles
38

Bidyuk, P. I., N. V. Kuznietsova, O. M. Trofymchuk, O. M. Terentiev, and L. B. Levenchuk. "Bayesian modeling of risks of various origin." KPI Science News, no. 4 (October 26, 2022): 7–18. http://dx.doi.org/10.20535/kpisn.2021.4.251684.

Full text
Abstract:
Background. Financial as well as many other types of risks are inherent in all types of human activities. The problem is to construct an adequate mathematical description for the formal representation of the selected risks and to use it for possible loss estimation and forecasting. The loss estimation can be based upon processing available data and relevant expert estimates characterizing the history and current state of the processes considered. An appropriate instrument for modelling and estimating risks of possible losses is provided by the probabilistic approach, including Bayesian techniques known today as the Bayesian programming methodology. Objective. The purpose of the paper is to give an overview of some Bayesian data processing methods that provide a possibility for constructing models of selected financial risks, and to use statistical data to develop a new model of Bayesian type that formally describes operational risk arising in information processing procedures. Methods. The methods used for data processing and model construction belong to the Bayesian programming methodology. Bayes' theorem was also directly applied to operational risk assessment in its formulation for discrete events and discrete parameters. Results. The proposed approach to modelling was applied to building a model of operational risk associated with incorrect information processing. To construct and apply the model to risk estimation, the risk problem was analysed, appropriate variables were selected, and prior conditional probabilities were estimated. The functioning of the constructed models was demonstrated with illustrative examples. Conclusions. Modelling and estimating financial and other types of risks is an important practical problem that can be solved using the Bayesian programming methodology, which provides the possibility of identifying and taking into consideration the uncertainties of data and expert estimates. The risk model constructed with the proposed methodology illustrates the possibilities of applying Bayesian methods to solving risk estimation problems.
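Bayes' theorem in its discrete formulation, applied to an operational-risk event, takes a few lines; the event probabilities below are invented for illustration:

```python
# Hypothetical operational-risk quantities: prior probability that a data
# processing step is faulty, and probabilities of an observed error flag
# under faulty and correct processing.
p_fault = 0.02
p_flag_given_fault = 0.90
p_flag_given_ok = 0.05

# Discrete Bayes theorem: P(fault | flag).
p_flag = p_flag_given_fault * p_fault + p_flag_given_ok * (1 - p_fault)
p_fault_given_flag = p_flag_given_fault * p_fault / p_flag
print(round(p_fault_given_flag, 3))   # posterior risk after one error flag
```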
APA, Harvard, Vancouver, ISO, and other styles
39

Scholl, Victoria M., Joseph McGlinchy, Teo Price-Broncucia, Jennifer K. Balch, and Maxwell B. Joseph. "Fusion neural networks for plant classification: learning to combine RGB, hyperspectral, and lidar data." PeerJ 9 (July 29, 2021): e11790. http://dx.doi.org/10.7717/peerj.11790.

Full text
Abstract:
Airborne remote sensing offers unprecedented opportunities to efficiently monitor vegetation, but methods to delineate and classify individual plant species using the collected data are still actively being developed and improved. The Integrating Data science with Trees and Remote Sensing (IDTReeS) plant identification competition openly invited scientists to create and compare individual tree mapping methods. Participants were tasked with training taxon identification algorithms based on two sites, to then transfer their methods to a third unseen site, using field-based plant observations in combination with airborne remote sensing image data products from the National Ecological Observatory Network (NEON). These data were captured by a high resolution digital camera sensitive to red, green, blue (RGB) light, hyperspectral imaging spectrometer spanning the visible to shortwave infrared wavelengths, and lidar systems to capture the spectral and structural properties of vegetation. As participants in the IDTReeS competition, we developed a two-stage deep learning approach to integrate NEON remote sensing data from all three sensors and classify individual plant species and genera. The first stage was a convolutional neural network that generates taxon probabilities from RGB images, and the second stage was a fusion neural network that “learns” how to combine these probabilities with hyperspectral and lidar data. Our two-stage approach leverages the ability of neural networks to flexibly and automatically extract descriptive features from complex image data with high dimensionality. Our method achieved an overall classification accuracy of 0.51 based on the training set, and 0.32 based on the test set which contained data from an unseen site with unknown taxa classes. Although transferability of classification algorithms to unseen sites with unknown species and genus classes proved to be a challenging task, developing methods with openly available NEON data that will be collected in a standardized format for 30 years allows for continual improvements and major gains for members of the computational ecology community. We outline promising directions related to data preparation and processing techniques for further investigation, and provide our code to contribute to open reproducible science efforts.
APA, Harvard, Vancouver, ISO, and other styles
40

Mølgaard, B., J. Vanhatalo, P. P. Aalto, N. L. Prisle, and K. Hämeri. "Notably improved inversion of Differential Mobility Particle Sizer data obtained under conditions of fluctuating particle number concentrations." Atmospheric Measurement Techniques Discussions 8, no. 10 (October 7, 2015): 10283–317. http://dx.doi.org/10.5194/amtd-8-10283-2015.

Full text
Abstract:
The Differential Mobility Particle Sizer (DMPS) is designed for measurements of particle number size distributions. It performs a number of measurements while scanning over different particle sizes. A standard assumption in the data processing (inversion) algorithm is that the size distribution remains the same throughout each scan. For a DMPS deployed in an urban area this assumption is likely to be violated most of the time, and the resulting size distribution data are unreliable. To improve the reliability, we developed a new algorithm using a statistical model in which the problematic assumption was replaced with more realistic smoothness assumptions, which were expressed through Gaussian Process prior probabilities. We tested the model with data from a twin-DMPS located in Helsinki and found that it provides size distribution data which are much more realistic. Furthermore, particle number concentrations extracted from the DMPS data were compared with data from a condensation particle counter at 30 s resolution, and the overall agreement was good. Thus, the quality of the inverted data was clearly improved.
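The general shape of such an inversion with a Gaussian-process smoothness prior can be illustrated with a toy linear problem. The instrument kernel, noise level, and squared-exponential covariance below are assumptions for illustration only; the paper's model is richer:

```python
# Toy sketch: recover a smooth size distribution x from few noisy
# measurements y = A x + noise, using a GP prior to encode smoothness.
import numpy as np

rng = np.random.default_rng(1)
m, n = 12, 40                                  # measurements, size-grid points
A = np.abs(rng.normal(size=(m, n)))            # assumed instrument kernel matrix
x_true = np.exp(-0.5 * ((np.arange(n) - 20) / 5.0) ** 2)
y = A @ x_true + rng.normal(scale=0.05, size=m)

# Squared-exponential GP covariance over the size grid enforces smoothness
idx = np.arange(n)
K = np.exp(-0.5 * (idx[:, None] - idx[None, :]) ** 2 / 4.0 ** 2) + 1e-6 * np.eye(n)

sigma2 = 0.05 ** 2
S = A @ K @ A.T + sigma2 * np.eye(m)           # marginal covariance of y
x_map = K @ A.T @ np.linalg.solve(S, y)        # GP posterior mean estimate
print("reconstruction error:", round(float(np.linalg.norm(x_map - x_true)), 3))
```

The prior takes over where the data are uninformative, which is what replaces the within-scan stationarity assumption.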
APA, Harvard, Vancouver, ISO, and other styles
41

Flowers, Gwenn E., and Garry K. C. Clarke. "Surface and bed topography of Trapridge Glacier, Yukon Territory, Canada: digital elevation models and derived hydraulic geometry." Journal of Glaciology 45, no. 149 (1999): 165–74. http://dx.doi.org/10.3189/s0022143000003142.

Full text
Abstract:
Measurements of ice thickness and surface elevation are prerequisite to many glaciological investigations. A variety of techniques has been developed for interpretation of these data, including means of constructing regularly gridded digital elevation models (DEMs) for use in numerical studies. Here we present a simple yet statistically sound method for processing ice-penetrating radar data and describe a technique for interpolating these data onto a regular grid. DEMs generated for Trapridge Glacier, Yukon Territory, Canada, are used to derive geometric quantities that give preliminary insights into the underlying basin-scale hydrological system. This simple geometric analysis suggests that at low water pressures a dendritic drainage network exists that evolves into a uniaxial morphology as water pressure approaches flotation. These predictions are compared to hydraulic connection probabilities based on borehole drilling.
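As a generic stand-in for the gridding step described here (not the authors' statistical method), scattered radar picks can be interpolated onto a regular DEM grid with inverse-distance weighting. Coordinates, elevations, and grid resolution below are invented:

```python
# Simple inverse-distance-weighted gridding of scattered survey points
# onto a regular DEM grid (illustrative only).
import numpy as np

rng = np.random.default_rng(2)
xs, ys = rng.uniform(0, 1000, 200), rng.uniform(0, 1000, 200)     # survey points (m)
z = 2000 - 0.3 * xs + 0.1 * ys + rng.normal(scale=2.0, size=200)  # bed elevation (m)

gx, gy = np.meshgrid(np.linspace(0, 1000, 50), np.linspace(0, 1000, 50))

def idw(px, py, pz, qx, qy, power=2.0, eps=1e-6):
    """Inverse-distance weighting of points (px, py, pz) at query grid (qx, qy)."""
    d2 = (qx[..., None] - px) ** 2 + (qy[..., None] - py) ** 2
    w = 1.0 / (d2 + eps) ** (power / 2.0)
    return (w * pz).sum(axis=-1) / w.sum(axis=-1)

dem = idw(xs, ys, z, gx, gy)
print(dem.shape, round(float(dem.mean()), 1))
```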
APA, Harvard, Vancouver, ISO, and other styles
42

Flowers, Gwenn E., and Garry K. C. Clarke. "Surface and bed topography of Trapridge Glacier, Yukon Territory, Canada: digital elevation models and derived hydraulic geometry." Journal of Glaciology 45, no. 149 (1999): 165–74. http://dx.doi.org/10.1017/s0022143000003142.

Full text
Abstract:
Measurements of ice thickness and surface elevation are prerequisite to many glaciological investigations. A variety of techniques has been developed for interpretation of these data, including means of constructing regularly gridded digital elevation models (DEMs) for use in numerical studies. Here we present a simple yet statistically sound method for processing ice-penetrating radar data and describe a technique for interpolating these data onto a regular grid. DEMs generated for Trapridge Glacier, Yukon Territory, Canada, are used to derive geometric quantities that give preliminary insights into the underlying basin-scale hydrological system. This simple geometric analysis suggests that at low water pressures a dendritic drainage network exists that evolves into a uniaxial morphology as water pressure approaches flotation. These predictions are compared to hydraulic connection probabilities based on borehole drilling.
APA, Harvard, Vancouver, ISO, and other styles
43

Skatkov, A. V., A. A. Bryukhovetskiy, and D. V. Moiseev. "Multivariate multichannel software and measurement complex for detecting anomalous states of natural and technical objects and systems." Monitoring systems of environment, no. 2 (June 24, 2021): 119–30. http://dx.doi.org/10.33075/2220-5861-2021-2-119-130.

Full text
Abstract:
An approach to the multivariate classification of the states of natural-technical objects and systems is considered, based on the development of methods for dynamic detection of anomalies in information data flows. The approach rests on an estimate of the statistical discrepancy between the probability distributions of random variables over variable time intervals, as well as estimates of the probabilities of errors of the first and second kind. The structure of a multichannel software and measurement complex for detecting anomalous states of natural-technical objects and systems is proposed, and the results of model calculations are presented. The multivariate approach makes it possible to optimize the processing, analysis, and integration of heterogeneous data, as well as to increase the sensitivity, reliability, and efficiency of decisions.
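The core idea, scoring the discrepancy between the empirical distributions of two time windows, can be sketched as follows. The Kullback-Leibler divergence, bin layout, and synthetic data are assumptions for illustration; the complex described above uses its own discrepancy estimates:

```python
# Sketch: compare the empirical distribution of a reference window with
# that of a new window; a large divergence flags a candidate anomaly.
import numpy as np

def window_hist(x, bins):
    h, _ = np.histogram(x, bins=bins)
    p = (h + 1.0) / (h.sum() + len(h))   # Laplace smoothing avoids zero bins
    return p

def kl(p, q):
    """Kullback-Leibler divergence between two discrete distributions."""
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(3)
bins = np.linspace(-5, 5, 17)
normal_win = rng.normal(0, 1, 500)             # reference behaviour
anomal_win = rng.normal(1.5, 1.3, 500)         # shifted and spread (anomaly)

p = window_hist(normal_win, bins)
print("normal vs normal :", round(kl(p, window_hist(rng.normal(0, 1, 500), bins)), 3))
print("normal vs anomaly:", round(kl(p, window_hist(anomal_win, bins)), 3))
```

Thresholding such a score, calibrated against the acceptable probabilities of errors of the first and second kind, yields the detection decision.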
APA, Harvard, Vancouver, ISO, and other styles
44

Müller, Martha-Lena, Niroshan Nadarajah, Kapil Jhalani, Inseok Heo, William Wetton, Claudia Haferlach, Torsten Haferlach, and Wolfgang Kern. "Employment of Machine Learning Models Yields Highly Accurate Hematological Disease Prediction from Raw Flow Cytometry Matrix Data without the Need for Visualization or Human Intervention." Blood 136, Supplement 1 (November 5, 2020): 11. http://dx.doi.org/10.1182/blood-2020-140927.

Full text
Abstract:
Background: Machine Learning (ML) offers automated data processing substituting various analysis steps. So far it has been applied to flow cytometry (FC) data only after visualization, which may compromise data by reducing data dimensionality. Automated analysis of FC raw matrix data has not yet been pursued. Aim: To establish, as proof of concept, an ML-based classifier processing FC matrix data to predict the correct lymphoma type without the need for visualization or human analysis and interpretation. Methods: A set of 6,393 uniformly analyzed samples (Navios cytometers, Kaluza software, Beckman Coulter, Miami, FL) was used for training (n=5,115) and testing (n=1,278) of different ML models. Entities were chronic lymphatic leukemia (CLL, 1,103 training/279 testing), monoclonal B-cell lymphocytosis (MBL, 831/203), CLL with increased prolymphocytes (CLL-PL, 649/161), lymphoplasmacytic lymphoma (LPL, 560/159), hairy cell leukemia (HCL, 328/88), mantle cell lymphoma (MCL, 259/53), marginal zone lymphoma (MZL, 90/28), follicular lymphoma (FL, 84/16), and no lymphoma (1,211/291). Three tubes comprising 11 parameters per tube were applied. Besides scatter signals, analyzed antigens included: CD3, CD4, CD5, CD8, CD10, CD11c, CD19, CD20, CD22, CD23, CD25, CD38, CD45, CD56, CD79b, CD103, FMC7, HLA-DR, IgM, Kappa, Lambda. Measurements generated LMD files with 50,000 rows of data for each of the 11 parameters. After removing the saturated values (≥ 1023) we produced binned histograms with 16 predefined frequency bins per parameter. Histograms were converted to cumulative distribution functions (CDF) for the respective parameters and concatenated to produce a 16x11 matrix per tube. Under the assumption of independence of parameters, this simplification of concatenating CDFs represents the same information as if the parameters were jointly distributed. The first matrix-based classifier was a decision tree model (DT), the second a deep learning model (DL), and the third an XGBoost (XG) model, an implementation of gradient-boosted decision trees ideal for structured tabular data (such as LMD files). The first set of analyses included only three classes which are readily separated by human operators: 1) CLL, 2) HCL, 3) no lymphoma. The second set included all nine entities but grouped into four classes: 1) CD5+ lymphoma (CLL, MBL, CLL-PL, MCL), 2) HCL, 3) other CD5- lymphoma (LPL, MZL, FL), 4) no lymphoma. The third set included each of the nine entities as its own class. Results: Analyzing the three classes from the first set (CLL, HCL, no lymphoma), the models achieved accuracies of 94% (DT), 95% (DL) and 96% (XG) when including all cases. By restricting the analysis to cases with prediction probabilities above 90%, DT reached 97%, DL 97% and XG 98% accuracy, whilst losing 38%, 8% and 6% of samples, respectively. We further observed that accuracy also depended on the size of the pathologic clone, which is in line with the experience of human experts, for whom very small clones (≤ 0.1% of leukocytes) represent a major challenge for correct classification. Focusing on cases with clones > 0.1% but considering all prediction probabilities, accuracies were 96% (DT), 97% (DL) and 98% (XG), with a loss of 5% of samples for each model. Considering only cases with prediction probabilities > 90% and clones > 0.1%, accuracies were 97% (DT), 99% (DL) and 99% (XG), whilst losing 38%, 9% and 9% of samples, respectively. Further analyses were performed applying the best model based on the results above, i.e. XG.
Analyzing four classes in the second set of analyses (CD5+ lymphoma, HCL, other CD5- lymphoma, no lymphoma) and considering only cases with prediction probabilities > 95% and clones > 0.1%, accuracy was 96% while losing 28% of samples. In the third set of analyses, with each entity assigned its own class and again considering only cases with prediction probabilities > 95% and clones > 0.1%, accuracy was 93% while losing 28% of samples. Conclusions: This first ML-based classifier, using the XGBoost model and transforming FC matrix data to concatenated distributions, is capable of correctly assigning the vast majority of lymphoma samples by analyzing FC raw data without visualization or human interpretation. Cases that need further attention by human experts will be flagged but will not account for more than 30% of all cases. These data will be extended in a prospective blinded study (clinicaltrials.gov NCT4466059). Disclosures: Heo: AWS, current employment. Wetton: AWS, current employment.
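The matrix-preparation step is specified closely enough in the abstract to sketch directly: drop saturated values (≥ 1023), bin each of the 11 parameters into 16 frequency bins, convert to CDFs, and stack into a 16x11 matrix per tube. The mock event data and the uniform bin edges over the 10-bit range are assumptions:

```python
# Sketch of the described pipeline: per-parameter 16-bin histograms over
# unsaturated events, converted to CDFs and stacked into a 16x11 matrix.
import numpy as np

rng = np.random.default_rng(4)
events = rng.integers(0, 1024, size=(50_000, 11))    # mock LMD matrix: 11 params

def tube_matrix(ev, n_bins=16, saturation=1023):
    cols = []
    for j in range(ev.shape[1]):
        v = ev[:, j]
        v = v[v < saturation]                         # remove saturated values
        h, _ = np.histogram(v, bins=n_bins, range=(0, saturation))
        cdf = np.cumsum(h) / h.sum()                  # cumulative distribution
        cols.append(cdf)
    return np.stack(cols, axis=1)                     # shape (16, 11)

M = tube_matrix(events)
print(M.shape, M[-1].round(2))                        # last CDF value is 1.0 per parameter
```

Flattening such matrices row-wise gives the fixed-length tabular input that gradient-boosted tree models like XGBoost handle well.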
APA, Harvard, Vancouver, ISO, and other styles
45

Gómez, David M., Peggy Mok, Mikhail Ordin, Jacques Mehler, and Marina Nespor. "Statistical Speech Segmentation in Tone Languages: The Role of Lexical Tones." Language and Speech 61, no. 1 (May 9, 2017): 84–96. http://dx.doi.org/10.1177/0023830917706529.

Full text
Abstract:
Research has demonstrated distinct roles for consonants and vowels in speech processing. For example, consonants have been shown to support lexical processes, such as the segmentation of speech based on transitional probabilities (TPs), more effectively than vowels. Theory and data so far, however, have considered only non-tone languages, that is to say, languages that lack contrastive lexical tones. In the present work, we provide a first investigation of the role of consonants and vowels in statistical speech segmentation by native speakers of Cantonese, as well as assessing how tones modulate the processing of vowels. Results show that Cantonese speakers are unable to use statistical cues carried by consonants for segmentation, but they can use cues carried by vowels. This difference becomes more evident when considering tone-bearing vowels. Additional data from speakers of Russian and Mandarin suggest that the ability of Cantonese speakers to segment streams with statistical cues carried by tone-bearing vowels extends to other tone languages, but is much reduced in speakers of non-tone languages.
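Transitional probabilities of the kind used in these segmentation experiments are simple to compute: TP(a→b) = count(ab) / count(a). The syllable stream below is invented for illustration:

```python
# Toy computation of transitional probabilities (TPs) over a syllable stream.
from collections import Counter

stream = "pa bi ku ti bu do pa bi ku go la tu ti bu do pa bi ku".split()

pairs  = Counter(zip(stream, stream[1:]))   # adjacent syllable pairs
firsts = Counter(stream[:-1])               # occurrences as the first element

def tp(a, b):
    return pairs[(a, b)] / firsts[a] if firsts[a] else 0.0

print("TP(pa->bi) =", round(tp("pa", "bi"), 2))   # within-word: high
print("TP(ku->ti) =", round(tp("ku", "ti"), 2))   # across a word boundary: lower
```

Listeners are hypothesized to posit word boundaries at dips in TP, and the study asks whether such dips are tracked over consonants, vowels, or tone-bearing vowels.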
APA, Harvard, Vancouver, ISO, and other styles
46

Yusup, Muhamad, Romzi Syauqi Naufal, and Marviola Hardini. "Management of Utilizing Data Analysis and Hypothesis Testing in Improving the Quality of Research Reports." Aptisi Transactions on Management (ATM) 2, no. 2 (January 25, 2019): 159–67. http://dx.doi.org/10.33050/atm.v2i2.789.

Full text
Abstract:
Data analysis and mathematical techniques play a central role in quantitative data processing. Quantitative researchers estimate the strength of the relationships among variables and test hypotheses statistically. The case is different with qualitative research: although qualitative researchers might test a hypothesis during the analysis process, they do not estimate or statistically test hypotheses about relationships among variables. Statistical tests serve as the main means for interpreting the results of research data; through them, researchers can compare groups of data and determine the probabilities that distinguish between groups beyond chance, thereby providing evidence for judging the validity of a hypothesis or conclusion. In this study, we discuss the preparation of data for analysis, such as editing, coding, categorizing, and entering data. We also discuss the differences between data analysis for descriptive and inferential statistics, the differences between parametric and non-parametric statistics in research, the procedures of multivariate data analysis, and the forms of research hypotheses.
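As one concrete instance of the inferential testing discussed, here is a two-sample t-test in Python with scipy; the groups and their meanings are synthetic placeholders:

```python
# Minimal inferential-statistics example: compare two groups with a
# two-sample t-test and interpret the p-value against alpha = 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
group_a = rng.normal(70, 8, 40)    # e.g. scores under method A (synthetic)
group_b = rng.normal(75, 8, 40)    # e.g. scores under method B (synthetic)

t, p = stats.ttest_ind(group_a, group_b)
print(f"t = {t:.2f}, p = {p:.4f}",
      "-> reject H0 at alpha=0.05" if p < 0.05 else "-> fail to reject H0")
```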
APA, Harvard, Vancouver, ISO, and other styles
47

Mulyani, Asri, Dede Kurniadi, Muhammad Rikza Nashrulloh, Indri Tri Julianto, and Meta Regita. "THE PREDICTION OF PPA AND KIP-KULIAH SCHOLARSHIP RECIPIENTS USING NAIVE BAYES ALGORITHM." Jurnal Teknik Informatika (Jutif) 3, no. 4 (August 20, 2022): 821–27. http://dx.doi.org/10.20884/1.jutif.2022.3.4.297.

Full text
Abstract:
The aim of the research was to predict the scholarship recipients for Peningkatan Prestasi Akademik (PPA) and Kartu Indonesia Pintar Kuliah (KIP-K). The prediction results provide information on the likelihood of acceptance or rejection of scholarship applicants. To achieve this goal, the study uses the Naive Bayes algorithm, which predicts future outcomes based on past data by reading the training data, calculating the probabilities, and classifying values using the mean and probability tables. The data analysis includes data collection, data processing, model implementation, and evaluation. The data used came from applicants for the Academic Achievement Improvement (PPA) scholarship and the Indonesia Smart Education Card (KIP-K) scholarship; 145 student records were used as training data. The results show that the Naive Bayes algorithm achieves an accuracy of 80% for PPA scholarships and 91% for KIP-K scholarships.
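A hedged sketch of the approach: a Gaussian Naive Bayes classifier trained on past applicant data that outputs acceptance probabilities. The features (GPA, parental income), their ranges, and the label rule below are invented placeholders, not the study's data:

```python
# Illustrative Naive Bayes prediction of scholarship acceptance on
# synthetic applicant data (145 records, matching the study's sample size).
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(6)
n = 145
gpa    = rng.uniform(2.5, 4.0, n)
income = rng.uniform(1.0, 10.0, n)               # assumed units; placeholder feature
X = np.column_stack([gpa, income])
y = ((gpa > 3.2) & (income < 5.0)).astype(int)   # synthetic "accepted" label

model = GaussianNB().fit(X, y)
applicant = np.array([[3.5, 3.0]])
print("P(rejected), P(accepted) =", model.predict_proba(applicant).round(3))
```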
APA, Harvard, Vancouver, ISO, and other styles
48

GANCHEV, TODOR. "ENHANCED TRAINING FOR THE LOCALLY RECURRENT PROBABILISTIC NEURAL NETWORKS." International Journal on Artificial Intelligence Tools 18, no. 06 (December 2009): 853–81. http://dx.doi.org/10.1142/s0218213009000433.

Full text
Abstract:
In the present contribution we propose an integral training procedure for the Locally Recurrent Probabilistic Neural Networks (LR PNNs). Specifically, the adjustment of the smoothing factor "sigma" in the pattern layer of the LR PNN and the training of the recurrent layer weights are integrated in an automatic process that iteratively estimates all adjustable parameters of the LR PNN from the available training data. Furthermore, in contrast to the original LR PNN, whose recurrent layer was trained to provide optimum separation among the classes on the training dataset, while striving to keep a balance between the learning rates for all classes, here the training strategy is oriented towards optimizing the overall classification accuracy, straightforwardly. More precisely, the new training strategy directly targets at maximizing the posterior probabilities for the target class and minimizing the posterior probabilities estimated for the non-target classes. The new fitness function requires fewer computations for each evaluation, and therefore the overall computational demands for training the recurrent layer weights are reduced. The performance of the integrated training procedure is illustrated on three different speech processing tasks: emotion recognition, speaker identification and speaker verification.
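A toy Python sketch of a fitness function of the kind described, rewarding the posterior of the target class and penalising the posteriors of the non-target classes, follows; the specific averaging scheme is an assumption, not the paper's exact criterion:

```python
# Illustrative fitness: mean margin between the target-class posterior
# and the average non-target posterior over a training set.
import numpy as np

def fitness(posteriors, targets):
    """posteriors: (n_samples, n_classes), rows sum to 1; targets: class ids."""
    n, k = posteriors.shape
    target_p = posteriors[np.arange(n), targets]
    nontarget_p = (posteriors.sum(axis=1) - target_p) / (k - 1)
    return float(np.mean(target_p - nontarget_p))

post = np.array([[0.7, 0.2, 0.1],
                 [0.1, 0.8, 0.1],
                 [0.3, 0.3, 0.4]])
print(round(fitness(post, np.array([0, 1, 2])), 3))   # higher is better
```

Because each evaluation is a single pass over precomputed posteriors, such a criterion is cheap, which matches the abstract's point about reduced computational demands.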
APA, Harvard, Vancouver, ISO, and other styles
49

Küttenbaum, Stefan, Stefan Maack, Alexander Taffe, and Thomas Braml. "On the treatment of measurement uncertainty in stochastic modeling of basic variables." Acta Polytechnica CTU Proceedings 36 (August 18, 2022): 109–18. http://dx.doi.org/10.14311/app.2022.36.0109.

Full text
Abstract:
The acquisition and appropriate processing of relevant information about the considered system remains a major challenge in the assessment of existing structures. Both the values and the validity of computed results such as failure probabilities essentially depend on the quantity and quality of the incorporated knowledge. One source of information is on-site measurement of structural or material characteristics to be modeled as basic variables in reliability assessment. The explicit use of (quantitative) measurement results in assessment requires the quantification of the quality of the measured information, i.e., the uncertainty associated with the information acquisition and processing. This uncertainty can be referred to as measurement uncertainty. Another crucial aspect is to ensure the comparability of the measurement results. This contribution outlines the necessity and the advantages of measurement uncertainty calculations in the modeling of measurement-data-based random variables to be included in reliability assessment. It is shown how measured data representing time-invariant characteristics, in this case non-destructively measured inner geometrical dimensions, can be transferred into measurement results that are both comparable and quality-evaluated. The calculations are based on the rules provided in the Guide to the Expression of Uncertainty in Measurement (GUM). The GUM framework is internationally accepted in metrology and can serve as a starting point for the appropriate processing of measured data to be used in assessment. In conclusion, the effects of incorporating the non-destructively measured data into reliability analysis are presented using a prestressed concrete bridge as a case study.
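A minimal GUM-style example of combining standard uncertainties by first-order propagation, u_c² = Σ (∂f/∂xᵢ)² uᵢ², is given below. The measurand and all numbers are illustrative, not from the case study:

```python
# GUM-style combined standard uncertainty for a derived quantity.
# Measurand (assumed): concrete cover c = t - d.
import numpy as np

t, u_t = 0.300, 0.004      # m, Type A uncertainty from repeated scans (assumed)
d, u_d = 0.215, 0.006      # m, Type B uncertainty from instrument spec (assumed)

c = t - d
u_c = np.sqrt((1.0 * u_t) ** 2 + (-1.0 * u_d) ** 2)   # sensitivity coefficients are +/-1
print(f"c = {c:.3f} m, u_c = {u_c:.4f} m (coverage k=2: +/-{2 * u_c:.4f} m)")
```

The combined standard uncertainty u_c is what enters the stochastic model of the basic variable, e.g. as the standard deviation of a normally distributed measurement error.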
APA, Harvard, Vancouver, ISO, and other styles
50

Kim, Si Gwan. "Reliable Cluster-Based Routing Algorithms in Wireless Sensor Networks." Applied Mechanics and Materials 284-287 (January 2013): 2147–51. http://dx.doi.org/10.4028/www.scientific.net/amm.284-287.2147.

Full text
Abstract:
With advanced micro-electromechanical technology, the development of small-size, low-cost, and low-power sensors that possess sensing, signal processing, and wireless communication capabilities is becoming more popular than ever. To achieve energy efficiency in wireless sensor networks, LEACH has been proposed as a routing protocol composed of a few clusters, each consisting of member nodes that sense the data and head nodes that deliver the collected data from member nodes to a sink node. When a wireless link error occurs, LEACH may miss some messages because each cluster has only one head. As our proposed scheme maintains two cluster heads for each cluster, messages have a higher probability of reaching the sink node. Simulation results show that our proposed algorithm is more robust than LEACH when wireless link errors occur.
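A back-of-envelope check of why the second cluster head helps: if each head independently forwards a message with success probability p, delivery fails only when both fail. The value of p below is an assumption for illustration:

```python
# Delivery probability with one vs. two independent cluster heads.
p = 0.85                                  # assumed per-head delivery probability
one_head  = p
two_heads = 1.0 - (1.0 - p) ** 2          # at least one of the two heads succeeds
print(f"one head: {one_head:.3f}, two heads: {two_heads:.3f}")
```

The gain comes at the cost of the extra energy spent by the second head, which is the trade-off the simulations evaluate.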
APA, Harvard, Vancouver, ISO, and other styles