Academic literature on the topic 'STATISTICAL FEATURE RANKING'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'STATISTICAL FEATURE RANKING.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "STATISTICAL FEATURE RANKING"

1. Mansoori, Eghbal G. "Using Statistical Measures for Feature Ranking." International Journal of Pattern Recognition and Artificial Intelligence 27, no. 01 (February 2013): 1350003. http://dx.doi.org/10.1142/s0218001413500031.

Abstract:
Feature ranking is a fundamental preprocessing step for feature selection, before performing any data mining task. Essentially, when there are too many features in a problem, dimensionality reduction by discarding weak features is highly desirable. In this paper, we develop an efficient feature ranking algorithm for selecting the more relevant features prior to deriving classification predictors. Unlike ranking criteria that rely on the training error of a predictor built on a feature, our approach is distance-based, employing only the statistical distribution of classes in each feature. It uses a scoring function as the ranking criterion to evaluate the correlation between each feature and the classes. This function comprises three measures for each class: the statistical between-class distance, the interclass overlapping measure, and an estimate of class impurity. To compute the statistical parameters used in these measures, a normalized histogram obtained for each class is employed as its a priori probability density. Since the proposed algorithm examines each feature individually, it provides a fast and cost-effective method for feature ranking. We have tested the effectiveness of our approach on benchmark data sets with high dimensions. For this purpose, some top-ranked features are selected and used in rule-based classifiers as the target data mining task. Compared with some popular feature ranking methods, the experimental results show that our approach performs better, as it can identify the more relevant features, leading to lower classification error.
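To make the idea concrete, below is a minimal Python sketch of a histogram-based, per-feature class-separation score in the spirit the abstract describes: normalized class histograms as density estimates, between-class distance discounted by overlap. The exact scoring function, the bin count, and the omission of the impurity term are our simplifications, not the paper's formulas.

```python
import numpy as np

def histogram_separation_score(x, y, bins=20):
    """Score one feature by how well its class-conditional histograms
    separate: larger between-class mean distance and smaller overlap
    give a higher score. Illustrative only; the paper's scoring
    function also includes a class-impurity estimate."""
    classes = np.unique(y)
    edges = np.histogram_bin_edges(x, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    # Normalized histograms act as per-class a priori densities.
    dens = [np.histogram(x[y == c], bins=edges, density=True)[0] + 1e-12
            for c in classes]
    score = 0.0
    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            mi = np.average(centers, weights=dens[i])   # class-i mean
            mj = np.average(centers, weights=dens[j])   # class-j mean
            overlap = np.minimum(dens[i], dens[j]).sum()
            score += abs(mi - mj) / (1.0 + overlap)
    return score

def rank_features(X, y):
    """Rank features independently, best first, as in filter ranking."""
    scores = [histogram_separation_score(X[:, k], y) for k in range(X.shape[1])]
    return np.argsort(scores)[::-1]
```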
2. Naim, Faradila, Mahfuzah Mustafa, Norizam Sulaiman, and Zarith Liyana Zahari. "Dual-Layer Ranking Feature Selection Method Based on Statistical Formula for Driver Fatigue Detection of EMG Signals." Traitement du Signal 39, no. 3 (June 30, 2022): 1079–88. http://dx.doi.org/10.18280/ts.390335.

Abstract:
Electromyography (EMG) signals are one of the most studied inputs for driver drowsiness detection systems. As the number of available EMG features can be daunting, finding the most significant and minimal feature subset is desirable. Hence, a simplified feature selection method is necessary. This work proposes a dual-layer ranking feature selection algorithm based on statistical formulas for EMG signals in driver fatigue detection. In the first layer, 21 filter algorithms were used to rank 47 sets of EMG features (25 time-domain and 9 frequency-domain), which were applied to six classifiers. In the second layer, all the ranks were re-ranked based on statistical formulas (average, median, mode and variance). The classification performance of all rankings was compared along with the number of features. The highest classification accuracy achieved was 95% for 12 features using the Average Statistical Rank (ASR) and an LDA classifier. It is conclusive that a combination of time-domain and frequency-domain features can deliver better performance than single-domain features. At the same time, the statistical rank ASR performed better than single filter ranks while reducing the number of features. The proposed model can serve as a benchmark for enhanced feature selection methods for EMG driver fatigue signals.
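The second-layer re-ranking described here reduces to simple rank aggregation. A minimal Python sketch of the Average Statistical Rank (ASR) step as we read it follows; the toy rank matrix is a placeholder, and the median, mode and variance variants would just swap the aggregation function.

```python
import numpy as np

# ranks[i, j] = rank that filter method i assigns to feature j
# (1 = most important). Toy values stand in for the 21 filter ranks.
ranks = np.array([
    [1, 3, 2, 4],
    [2, 1, 3, 4],
    [1, 2, 4, 3],
])

asr = ranks.mean(axis=0)      # average rank per feature (the ASR)
reranked = np.argsort(asr)    # feature indices, best first
print(reranked)               # -> [0 1 2 3] for this toy matrix
```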
3. Soheili, Majid, Amir-Masoud Eftekhari Moghadam, and Mehdi Dehghan. "Statistical Analysis of the Performance of Rank Fusion Methods Applied to a Homogeneous Ensemble Feature Ranking." Scientific Programming 2020 (September 10, 2020): 1–14. http://dx.doi.org/10.1155/2020/8860044.

Abstract:
Feature ranking, as a subcategory of feature selection, is an essential preprocessing technique that ranks all features of a dataset so that more important, information-rich features are placed higher. Ensemble learning has two advantages. First, it is based on the assumption that combining the outputs of different models can lead to a better outcome than the output of any individual model. Second, scalability is an intrinsic characteristic that is crucial for coping with large-scale datasets. In this paper, a homogeneous ensemble feature ranking algorithm is considered, and the nine rank fusion methods used in this algorithm are analyzed comparatively. The experimental studies are performed on six real medium-sized datasets, and the area under the feature-forward-addition curve criterion is assessed. Finally, statistical analysis by repeated-measures analysis of variance reveals that the differences in performance among the rank fusion methods applied in a homogeneous ensemble feature ranking are small; they are nevertheless statistically significant, and the B-Min method performs slightly better.
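The evaluation criterion named here, the area under the feature-forward-addition curve, can be sketched compactly: add features in ranked order, score a classifier at each step, and average the resulting curve. A hedged Python sketch under our reading of the abstract; GaussianNB and 5-fold cross-validation are our choices, not the paper's setup.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def ffa_area(X, y, ranking, clf=None):
    """Accuracy with the top-1, top-2, ..., top-k ranked features,
    averaged over the curve (equivalent to a unit-spaced area).
    Higher is better; `ranking` lists feature indices, best first."""
    clf = clf or GaussianNB()
    accs = []
    for k in range(1, len(ranking) + 1):
        cols = list(ranking[:k])
        accs.append(cross_val_score(clf, X[:, cols], y, cv=5).mean())
    return float(np.mean(accs))
```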
4. Mogstad, Magne, Joseph Romano, Azeem Shaikh, and Daniel Wilhelm. "Statistical Uncertainty in the Ranking of Journals and Universities." AEA Papers and Proceedings 112 (May 1, 2022): 630–34. http://dx.doi.org/10.1257/pandp.20221064.

Abstract:
Economists are obsessed with rankings of institutions, journals, or scholars according to the value of some feature of interest. These rankings are invariably computed using estimates rather than the true values of such features. As a result, there may be considerable uncertainty concerning the ranks. In this paper, we consider the problem of accounting for such uncertainty by constructing confidence sets for the ranks. We consider the problem of constructing marginal confidence sets for the rank of, say, a particular journal as well as simultaneous confidence sets for the ranks of all journals.
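As a loose illustration of the idea (not the authors' construction, which comes with formal marginal and simultaneous coverage guarantees that this sketch lacks), a naive percentile bootstrap in Python already shows how uncertainty in ranks can be quantified:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a "feature of interest" (e.g. citation counts) for 5 journals.
samples = [rng.poisson(lam, size=200) for lam in (5.0, 5.2, 6.0, 8.0, 8.1)]

# Naive percentile-bootstrap confidence sets for each journal's rank.
B = 2000
ranks = np.empty((B, len(samples)), dtype=int)
for b in range(B):
    means = [rng.choice(s, size=s.size, replace=True).mean() for s in samples]
    # rank 0 = largest bootstrap mean
    ranks[b] = np.argsort(np.argsort(means)[::-1])

for j in range(len(samples)):
    lo, hi = np.percentile(ranks[:, j], [2.5, 97.5])
    print(f"journal {j}: 95% interval for rank = [{int(lo) + 1}, {int(hi) + 1}]")
```

Note how journals with nearly identical means (0 vs. 1, 3 vs. 4) get overlapping rank intervals, which is exactly the uncertainty the paper formalizes.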
5. Zhang, Zhicheng, Xiaokun Liang, Wenjian Qin, Shaode Yu, and Yaoqin Xie. "matFR: A MATLAB Toolbox for Feature Ranking." Bioinformatics 36, no. 19 (July 8, 2020): 4968–69. http://dx.doi.org/10.1093/bioinformatics/btaa621.

Abstract:
Summary: Nowadays, it is feasible to collect massive numbers of features for quantitative representation and precision medicine, and thus automatic ranking to figure out the most informative and discriminative ones becomes increasingly important. To address this issue, 42 feature ranking (FR) methods are integrated to form a MATLAB toolbox (matFR). The methods apply mutual information, statistical analysis, structure clustering and other principles to estimate the relative importance of features in specific measure spaces. Specifically, these methods are summarized, and an example shows how to apply an FR method to sort mammographic breast lesion features. The toolbox is easy to use and flexible to integrate additional methods. Importantly, it provides a tool to compare, investigate and interpret the features selected for various applications. Availability and implementation: The toolbox is freely available at http://github.com/NicoYuCN/matFR. A tutorial and an example with a dataset are provided.
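matFR itself is a MATLAB toolbox, so its API is not reproduced here. As a rough Python analogue of one principle it integrates, mutual information, scikit-learn can rank features in the same spirit:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif

# Mutual-information feature ranking on a public breast-cancer dataset;
# a stand-in for one of the 42 principles matFR wraps, not matFR's API.
X, y = load_breast_cancer(return_X_y=True)
mi = mutual_info_classif(X, y, random_state=0)
ranking = np.argsort(mi)[::-1]   # most informative feature first
print(ranking[:10])
```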
6. Sadeghi, Sabereh, and Hamid Beigy. "A New Ensemble Method for Feature Ranking in Text Mining." International Journal on Artificial Intelligence Tools 22, no. 03 (June 2013): 1350010. http://dx.doi.org/10.1142/s0218213013500103.

Abstract:
Dimensionality reduction is a necessary task in data mining when working with high-dimensional data. One type of dimensionality reduction is feature selection, and feature selection based on feature ranking has received much attention from researchers. The major reasons are its scalability, ease of use, and fast computation. Feature ranking methods can be divided into different categories and may use different measures for ranking features. Recently, ensemble methods have entered the field of ranking and achieved higher accuracy than the alternatives. Accordingly, in this paper a heterogeneous ensemble-based algorithm for feature ranking is proposed. The base ranking methods in this ensemble structure are chosen from different categories, such as information-theoretic, distance-based, and statistical methods. The results of the base ranking methods are then fused into a final feature subset by means of a genetic algorithm. The diversity of the base methods improves the quality of the initial population of the genetic algorithm and thus reduces its convergence time. In most ranking methods, it is the user's task to determine the threshold for choosing an appropriate subset of features, which may force the user to try many different values before selecting a good one. The proposed algorithm reduces this difficulty. Its performance is evaluated on four different text datasets, and the experimental results show that the proposed method outperforms the five other feature ranking methods used for comparison. One advantage of the proposed method is that it is independent of the classification method used.
7. Novakovic, Jasmina, Perica Strbac, and Dusan Bulatovic. "Toward Optimal Feature Selection Using Ranking Methods and Classification Algorithms." Yugoslav Journal of Operations Research 21, no. 1 (2011): 119–35. http://dx.doi.org/10.2298/yjor1101119n.

Abstract:
We present a comparison between several feature ranking methods used on two real datasets. We consider six ranking methods that can be divided into two broad categories: statistical and entropy-based. Four supervised learning algorithms are adopted to build models, namely IB1, Naive Bayes, the C4.5 decision tree and the RBF network. We show that the selection of ranking methods can be important for classification accuracy. In our experiments, ranking methods paired with different supervised learning algorithms give quite different results for balanced accuracy. Our cases confirm that, to be sure that a subset of features giving the highest accuracy has been selected, the use of many different indices is recommended.
8. Leguia, Marc G., Zoran Levnajić, Ljupčo Todorovski, and Bernard Ženko. "Reconstructing Dynamical Networks via Feature Ranking." Chaos: An Interdisciplinary Journal of Nonlinear Science 29, no. 9 (September 2019): 093107. http://dx.doi.org/10.1063/1.5092170.

9. Wang, W., P. Jones, and D. Partridge. "A Comparative Study of Feature-Salience Ranking Techniques." Neural Computation 13, no. 7 (July 1, 2001): 1603–23. http://dx.doi.org/10.1162/089976601750265027.

Abstract:
We assess the relative merits of a number of techniques designed to determine the relative salience of the elements of a feature set with respect to their ability to predict a category outcome: for example, which features of a character contribute most to accurate character recognition. A number of different neural-net-based techniques have been proposed (by us and others) in addition to a standard statistical technique, and we add a technique based on inductively generated decision trees. The salience of the features that compose a proposed set is an important problem to solve efficiently and effectively, not only for neural computing technology but also to provide a sound basis for any attempt to design an optimal computational system. The focus of this study is the efficiency and effectiveness with which high-salience subsets of features can be identified in the context of ill-understood and potentially noisy real-world data. Our two simple approaches, weight clamping using a neural network and feature ranking using a decision tree, generally provide a good, consistent ordering of features. In addition, linear correlation often works well.
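Two of the cheap salience estimates discussed here are easy to sketch with scikit-learn: impurity-based importances from an inductively generated decision tree, and plain linear correlation with the class label. (Weight clamping would additionally require a trained neural network, so it is omitted from this sketch.)

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.tree import DecisionTreeClassifier

X, y = load_wine(return_X_y=True)

# Salience estimate 1: decision-tree impurity-based importances.
tree_rank = np.argsort(
    DecisionTreeClassifier(random_state=0).fit(X, y).feature_importances_
)[::-1]

# Salience estimate 2: absolute linear correlation with the label.
corr = [abs(np.corrcoef(X[:, k], y)[0, 1]) for k in range(X.shape[1])]
corr_rank = np.argsort(corr)[::-1]

print("tree ranking:       ", tree_rank)
print("correlation ranking:", corr_rank)
```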
10. Werner, Tino. "A Review on Instance Ranking Problems in Statistical Learning." Machine Learning 111, no. 2 (November 18, 2021): 415–63. http://dx.doi.org/10.1007/s10994-021-06122-3.

Abstract:
Ranking problems, also known as preference learning problems, define a widely spread class of statistical learning problems with many applications, including fraud detection, document ranking, medicine, chemistry, credit risk screening, image ranking and media memorability. While there already exist reviews concentrating on specific types of ranking problems like label and object ranking, there does not yet seem to exist an overview of instance ranking problems that both covers developments in distinguishing between different types of instance ranking problems and carefully discusses their differences and the applicability of existing ranking algorithms to them. In instance ranking, one explicitly takes the responses into account with the goal of inferring a scoring function that directly maps feature vectors to real-valued ranking scores, in contrast to object ranking problems, where the ranks are given as preference information with the goal of learning a permutation. In this article, we systematically review different types of instance ranking problems and the corresponding loss functions and goodness criteria. We discuss the difficulties that arise when trying to optimize those criteria. To give a detailed and comprehensive overview of existing machine learning techniques for solving such ranking problems, we systematize them and recapitulate the corresponding optimization problems in a unified notation. We also discuss which instance ranking problems the respective algorithms are tailored to and identify their strengths and limitations. Computational aspects and open research problems are also considered.

Dissertations / Theses on the topic "STATISTICAL FEATURE RANKING"

1. Luong, Ngoc Quang. "Word Confidence Estimation and Its Applications in Statistical Machine Translation." Thesis, Grenoble, 2014. http://www.theses.fr/2014GRENM051/document.

Abstract:
Machine Translation (MT) systems, which automatically generate a target-language translation for each source sentence, have achieved impressive gains during recent decades and are becoming effective language assistants for the entire community in a globalized world. Nonetheless, due to various factors, MT quality is still not perfect in general, and end users therefore want to know how much they should trust a specific translation. Building a method that is capable of pointing out the correct parts, detecting translation errors and concluding the overall quality of each MT hypothesis is definitely beneficial not only for end users, but also for translators, post-editors, and MT systems themselves. Such methods are widely known under the name Confidence Estimation (CE) or Quality Estimation (QE). The motivation for building such automatic estimation methods originates from the drawbacks of assessing MT quality manually: the task is time-consuming, costly in effort, and sometimes impossible when readers have little or no knowledge of the source language. This thesis mostly focuses on CE methods at the word level (WCE). The WCE classifier tags each word in the MT output with a quality label. The WCE working mechanism is straightforward: a classifier, trained beforehand on a number of features using ML methods, computes the confidence score of each label for each MT output word, then tags the word with the highest-scoring label. Nowadays, WCE shows increasing importance in many aspects of MT. Firstly, it assists post-editors in quickly identifying translation errors, hence improving their productivity. Secondly, it informs readers of portions of a sentence that are not reliable, avoiding misunderstanding of the sentence's content. Thirdly, it selects the best translation among the options from multiple MT systems. Last but not least, WCE scores can help to improve MT quality via several scenarios: N-best list re-ranking, search graph re-decoding, etc. In this thesis, we aim at building and optimizing our baseline WCE system, then exploiting it to improve MT and Sentence Confidence Estimation (SCE). Compared to previous approaches, our novel contributions cover the following main points. Firstly, we integrate various types of prediction indicators: system-based features extracted from the MT system, together with lexical, syntactic and semantic features, to build the baseline WCE systems. We also apply multiple Machine Learning (ML) models on the entire feature set and compare their performances to select the optimal one to optimize. Secondly, the usefulness of all features is investigated more deeply using a greedy feature selection algorithm. Thirdly, we propose a solution that exploits the Boosting algorithm as a learning method in order to strengthen the contribution of dominant feature subsets to the system, thus improving the system's prediction capability. Lastly, we explore the contributions of WCE to improving MT quality via several scenarios. In N-best list re-ranking, we synthesize scores from WCE outputs and integrate them with decoder scores to recompute the objective function value and re-order the N-best list to choose a better candidate. In the decoder's search graph re-decoding, the proposition is to apply WCE scores directly to the nodes containing each word to update their costs according to word quality.
Furthermore, WCE scores are used to build useful features that can enhance the performance of the Sentence Confidence Estimation system. In total, our work provides an insightful and multidimensional picture of word quality prediction and its positive impact on various sectors of Machine Translation. The promising results open a broad avenue where WCE can play its role, such as WCE for Automatic Speech Recognition (ASR) systems, WCE for selection among multiple MT systems, and WCE for re-trainable and self-learning MT systems.
2. Peel, Thomas. "Algorithmes de poursuite stochastiques et inégalités de concentration empiriques pour l'apprentissage statistique." Thesis, Aix-Marseille, 2013. http://www.theses.fr/2013AIXM4769/document.

Abstract:
The first part of this thesis introduces new algorithms for the sparse encoding of signals. Based on Matching Pursuit (MP), they address the following problem: how to reduce the computation time of the selection step of MP. As an answer, we sub-sample the dictionary by rows and columns at each iteration. We show that this theoretically grounded approach has good empirical performance. We then propose a block coordinate gradient descent algorithm for feature selection problems in the multiclass classification setting. Thanks to the use of error-correcting output codes, this task can be seen as a simultaneous sparse encoding of signals problem. The second part presents new empirical Bernstein inequalities. First, they concern the theory of U-statistics and are applied to design generalization bounds for ranking algorithms. These bounds take advantage of a variance estimator, and we propose an efficient algorithm to compute it. Then, we present an empirical version of the Bernstein-type inequality for martingales by Freedman [1975]. Again, the strength of our result lies in a variance estimator computable from the data. This allows us to propose generalization bounds for online learning algorithms which improve on the state of the art and pave the way to a new family of learning algorithms taking advantage of this empirical information.
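For orientation, one standard empirical Bernstein bound, the Maurer-Pontil form for i.i.d. variables in [0, 1] with sample variance V_n, shows the role a data-driven variance estimator plays; the thesis develops analogous empirical bounds for U-statistics and martingales, whose details differ:

```latex
\Pr\left( \mathbb{E}[X] - \frac{1}{n}\sum_{i=1}^{n} X_i
  \;\le\; \sqrt{\frac{2\,V_n \ln(2/\delta)}{n}}
  \;+\; \frac{7\ln(2/\delta)}{3(n-1)} \right) \;\ge\; 1-\delta .
```

The first term shrinks with the observed variance rather than a worst-case range, which is what makes such bounds attractive for ranking and online learning.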
3. Kumar, Akhil. "Machine Learning Based Intrusion Detection System Using Statistical Feature Ranking Method." Thesis, 2023. http://dspace.dtu.ac.in:8080/jspui/handle/repository/19866.

Abstract:
Big data has made it easier for people to live an information-based Internet lifestyle, but it has also created a number of serious network security issues that make it difficult to use networks on a regular basis. At the moment, intrusion detection systems (IDS) are mostly used to identify aberrant network traffic. To keep track of packets entering a network, an IDS employs sensors. To find malicious packets, it compares the packet data with the attack signatures it has stored in memory. Another sort of IDS analyses the patterns of the monitored packets to spot packets that are attempting to attack the network; such IDSs are believed to be able to identify new sorts of assaults and detect packet irregularities. Both varieties of IDS report malicious activities at the management console. An IDS offers an automated system to find both internal and external intruders. Firewalls show and/or restrict the ports and IP addresses used for communication between two entities, whereas an IDS is able to inspect the content of the packets before acting. The workings of current traffic intrusion detection systems nonetheless need to change, due to their numerous flaws and high resource occupation rates. In this study we therefore propose a statistical-analysis-based intrusion detection system built on a Machine Learning (ML) technique: intrusions are detected by applying the t-test, a statistical tool for ranking analysis, in its two-sample form assuming unequal variances. A substantial amount of network traffic data that includes both malware data and normal traffic data is gathered in order to identify the pattern of the malware data. The t-test is used to score nine different traffic aspects for both intrusion and regular traffic, resulting in nine "t" values from which further features were deduced. The Naive Bayes machine learning algorithm is then applied to the 9 features, deleting the feature with the lowest "t" value one at a time to produce 9 alternative accuracy values. Examining the accuracy values, we conclude that removing the two lowest-ranked features attains the highest accuracy, with accuracy increasing as each of those two features is removed. Our work achieves an accuracy of 95.69% using the top 7 features rather than all 9 features. Hence, we can argue that feature ranking using the t-test helps improve the overall detection accuracy.
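A minimal Python sketch of the pipeline as described: Welch's two-sample t-test (unequal variances) ranks the features, then Naive Bayes accuracy is tracked as the lowest-|t| features are dropped one at a time. GaussianNB and 5-fold cross-validation are our assumptions, and loading the traffic data is left to the caller.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def t_rank_and_evaluate(X, y):
    """Rank features by |t| from a two-sample t-test with unequal
    variances (intrusion vs. normal, y in {0, 1}), then record Naive
    Bayes accuracy as the lowest-|t| features are removed one by one."""
    t_vals = np.abs(ttest_ind(X[y == 1], X[y == 0],
                              equal_var=False, axis=0).statistic)
    order = np.argsort(t_vals)[::-1]      # highest |t| first
    accs = {}
    for k in range(X.shape[1], 0, -1):    # all 9 features, then 8, ...
        kept = order[:k]
        accs[k] = cross_val_score(GaussianNB(), X[:, kept], y, cv=5).mean()
    return order, accs
```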

Book chapters on the topic "STATISTICAL FEATURE RANKING"

1. Kamarainen, J. K., J. Ilonen, P. Paalanen, M. Hamouz, H. Kälviäinen, and J. Kittler. "Object Evidence Extraction Using Simple Gabor Features and Statistical Ranking." In Image Analysis, 119–29. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005. http://dx.doi.org/10.1007/11499145_14.

2. Zhang, Xueyan, Yixuan Zhang, Ye Yang, Chengcheng Deng, and Jun Yang. "Uncertainty Analysis and Sensitivity Evaluation of a Main Steam Line Break Accident on an Advanced PWR." In Springer Proceedings in Physics, 327–41. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-1023-6_30.

Abstract:
A RELAP5 input model was established for a scaled-up facility simulating China's Advanced Passive Water Reactor with passive safety features. The simulation was performed to reproduce a Main Steam Line Break (MSLB) scenario at the steam line connected to one Steam Generator. The figures of merit selected in this accident scenario include the maximum containment pressure and the mass and energy release to containment. Driving factors of this response function include the Passive Residual Heat Removal material thermal conductivity, the Pressurizer temperature, and the broken steam line temperature. To achieve an adequately justified safety margin using a Best Estimate Plus Uncertainty analysis, dominant phenomena were selected from a reference Phenomenon Identification and Ranking Table. The calculation results were compared with the available reference data of a similar Generation III Passive Water Reactor to assess the code's capability to predict the MSLB phenomena. The DAKOTA toolkit is used to drive both parameter sensitivity analysis and uncertainty propagation. The 95/95 uncertainty bands of key output parameters were obtained using Wilks' statistical method. Compared with the reference data, the simulation results partially confirmed the stability and repeatability of the code model under initial and boundary condition perturbations. The uncertainty bands of important output parameters were demonstrated. The results indicated that the maximum containment pressure remained below the safety limit and that the passive safety system can mitigate the consequences of the MSLB. The mass and energy released into the containment were assessed against the containment design. The parameter sensitivity analysis was performed with 34 input parameters, and the results were evaluated with Spearman's simple rank correlation coefficients.
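Two statistical building blocks named in this abstract are easy to sketch in Python: the first-order, one-sided Wilks criterion behind 95/95 tolerance bounds, and Spearman's rank correlation used for the sensitivity ranking. The toy vectors below are placeholders, not the study's data.

```python
import math
from scipy.stats import spearmanr

def wilks_sample_size(beta=0.95, gamma=0.95):
    """Smallest N with 1 - beta**N >= gamma: the classic first-order,
    one-sided Wilks criterion for tolerance bounds."""
    return math.ceil(math.log(1.0 - gamma) / math.log(beta))

print(wilks_sample_size())   # -> 59 runs for a 95/95 one-sided bound

# Spearman's rank correlation, as used to rank the 34 input parameters.
inputs = [1.0, 1.2, 0.8, 1.5, 1.1]          # one perturbed input, per run
peak_pressure = [3.1, 3.3, 2.9, 3.6, 3.2]   # corresponding output
rho, p = spearmanr(inputs, peak_pressure)
print(rho, p)
```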
3. Érdi, Péter. "Choices, Games, Laws, and the Web." In Ranking, 65–98. Oxford University Press, 2019. http://dx.doi.org/10.1093/oso/9780190935467.003.0004.

Abstract:
This chapter starts with the notions of "objective reality" and "subjective reality." Objectivity attempts to represent the outside world without bias or presuppositions, while subjectivity results from personal cognition or preferences. The chapter discusses the mechanisms by which people make choices. Research conducted over the last 60 years has shifted our understanding of human decision making from the concept of rationality to a new model that acknowledges the role of cognitive biases. Individual choices and preferences are aggregated to form social preferences, and this chapter reviews some techniques behind this aggregation. It also explains that preference ranking does not always imply a unique result, because it is possible to get a cyclic pathway, as in the rock, paper, scissors game. Elements of this game appeared both in ancient religious systems and in the US governmental system. Then the chapter turns to the famous PageRank algorithm, which made Google what it is today. The algorithm is able to produce a relevant ranking of websites within a very reasonable time. It can also produce different results, and rank reversal may happen in real-world situations. Ranking many elements based on some characteristic feature, such as words by the frequency of their occurrence, can use statistical methods. In many real cases, the distribution of these features deviates strongly from the bell curve and instead follows a skewed distribution, technically called a power law distribution.
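Since the chapter leans on PageRank, a textbook power-iteration sketch in Python captures the core idea; this is the standard formulation with the conventional damping factor 0.85, not Google's production system.

```python
import numpy as np

def pagerank(adj, d=0.85, tol=1e-10):
    """Minimal power-iteration PageRank on a dense adjacency matrix
    (adj[i, j] = 1 if page i links to page j)."""
    n = adj.shape[0]
    out = adj.sum(axis=1, keepdims=True)
    out[out == 0] = 1                 # avoid division by zero (dangling)
    M = adj / out                     # row-stochastic link matrix
    r = np.full(n, 1.0 / n)           # uniform starting scores
    while True:
        r_next = (1 - d) / n + d * (M.T @ r)
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next

adj = np.array([[0, 1, 1],
                [1, 0, 0],
                [0, 1, 0]], dtype=float)
print(pagerank(adj))    # scores; the ranking follows by argsort
```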
4. Kastrati, Zenun, Ali Shariq Imran, and Sule Yildirim Yayilgan. "A Hybrid Concept Learning Approach to Ontology Enrichment." In Innovations, Developments, and Applications of Semantic Web and Information Systems, 85–119. IGI Global, 2018. http://dx.doi.org/10.4018/978-1-5225-5042-6.ch004.

Abstract:
The wide use of ontologies in different applications has resulted in a plethora of automatic approaches for the population and enrichment of an ontology. Ontology enrichment is an iterative process in which the existing ontology is continuously updated with new concepts. A key aspect of the ontology enrichment process is the concept learning approach. A learning approach can be linguistic-based, statistical-based, or hybrid-based, employing both linguistic and statistical learning. This chapter presents a concept enrichment model that combines contextual and semantic information about terms. The proposed model, called SEMCON, employs a hybrid concept learning approach utilizing functionalities from statistical and linguistic ontology learning techniques. The model introduces, for the first time, two statistical features that have been shown to improve the overall score ranking of highly relevant terms for concept enrichment. The chapter also gives some recommendations and possible future research directions based on the discussion in the following sections.
5. Jovančić, Predrag D., Miloš Tanasijević, Vladimir Milisavljević, Aleksandar Cvjetić, Dejan Ivezić, and Uglješa Srbislav Bugarić. "Applying the Fuzzy Inference Model in Maintenance Centered to Safety." In Advances in Civil and Industrial Engineering, 142–65. IGI Global, 2020. http://dx.doi.org/10.4018/978-1-7998-3904-0.ch009.

Abstract:
The main idea of this chapter is to promote maintenance centered on safety, based on an adaptive fuzzy inference model that adjusts online to working conditions. Input data for this model are quality-of-service indicators of the analyzed engineering system: reliability, maintainability, failure consequence and severity, and detectability. The indicators are obtained in their final form through permanent monitoring of the engineering system and statistical processing. The level of safety is established by composing and ranking the indicators with a fuzzy inference engine. The problem of monitoring and processing the indicators comprising safety is solved using the features that Industry 4.0 provides. Maintenance centered on safety is important for complex, multi-hierarchy engineering systems, where sudden failures could have significant financial and environmental effects. The developed model is tested in the final part of the chapter in a case study of a bucket wheel excavator.

Conference papers on the topic "STATISTICAL FEATURE RANKING"

1. Sharma, Yash, Somya Sharma, and Anshul Arora. "Feature Ranking Using Statistical Techniques for Computer Networks Intrusion Detection." In 2022 7th International Conference on Communication and Electronics Systems (ICCES). IEEE, 2022. http://dx.doi.org/10.1109/icces54183.2022.9835831.

2. Tuan, Pham Minh, Nguyen Linh Trung, Mouloud Adel, and Eric Guedj. "AutoEncoder-Based Feature Ranking for Predicting Mild Cognitive Impairment Conversion Using FDG-PET Images." In 2023 IEEE Statistical Signal Processing Workshop (SSP). IEEE, 2023. http://dx.doi.org/10.1109/ssp53291.2023.10208072.

3. Kumar, Akhil, and Shailender Kumar. "Intrusion Detection Based on Machine Learning and Statistical Feature Ranking Techniques." In 2023 13th International Conference on Cloud Computing, Data Science & Engineering (Confluence). IEEE, 2023. http://dx.doi.org/10.1109/confluence56041.2023.10048802.

4. Wang, Min, Xipeng Chen, Wentao Liu, Chen Qian, Liang Lin, and Lizhuang Ma. "DRPose3D: Depth Ranking in 3D Human Pose Estimation." In Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18). California: International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/136.

Abstract:
In this paper, we propose a two-stage depth-ranking-based method (DRPose3D) to tackle the problem of 3D human pose estimation. Unlike accurate 3D positions, depth rankings can be identified intuitively by humans and learned more easily by deep neural networks as classification problems. Moreover, depth rankings contain rich 3D information, which prevents the 2D-to-3D pose regression in two-stage methods from being ill-posed. In our method, firstly, we design a Pairwise Ranking Convolutional Neural Network (PRCNN) to extract depth rankings of human joints from images. Secondly, a coarse-to-fine 3D Pose Network (DPNet) is proposed to estimate 3D poses from both depth rankings and 2D human joint locations. Additionally, to improve the generality of our model, we introduce a statistical method to augment depth rankings. Our approach outperforms state-of-the-art methods on the Human3.6M benchmark for all three testing protocols, indicating that depth ranking is an essential geometric feature that can be learned to improve 3D pose estimation.
5. Tamilarasan, A., S. Mukkamala, A. H. Sung, and K. Yendrapalli. "Feature Ranking and Selection for Intrusion Detection Using Artificial Neural Networks and Statistical Methods." In The 2006 IEEE International Joint Conference on Neural Network Proceedings. IEEE, 2006. http://dx.doi.org/10.1109/ijcnn.2006.247131.

6. Bahrami, Peyman, and Lesley A. James. "Field Production Optimization Using Smart Proxy Modeling; Implementation of Sequential Sampling, Average Feature Ranking, and Convolutional Neural Network." In SPE Canadian Energy Technology Conference and Exhibition. SPE, 2023. http://dx.doi.org/10.2118/212809-ms.

Abstract:
This work aims to create an approximation of a reservoir numerical model using smart proxy modeling (SPM) to be used for production optimization. The SPM constructed in this work is improved in several steps to increase its accuracy and efficiency compared to the existing literature. These steps include sequential sampling, average feature ranking, convolutional neural network (CNN) deep learning modeling, and feature engineering. SPM is a novel methodology that generates results faster than numerical simulators: it decouples the mathematical equations of the problem into a numeric dataset and trains a statistical/AI-driven model on that dataset. The major SPM construction steps are: objective, input, and output selection; sampling; running the numerical model; extracting new static and dynamic parameters; forming a new dataset; performing feature selection; training and validating the underlying model; and employing the SPM. Unlike traditional proxy modeling, SPM implements feature engineering techniques that generate new static/dynamic parameters. The extracted parameters help to capture hidden patterns within the dataset, eventually increasing the SPM's accuracy. An SPM can either be constructed to predict the grids' characteristics, called grid-based SPM, or to predict the wells' fluid rates, called well-based SPM. In this work, a well-based SPM is constructed to duplicate the production results of the Volve offshore field undergoing waterflooding. We used Latin hypercube sampling coupled with a genetic algorithm (GA) in the sampling step. The designed parameters for sampling are the individual liquid rates of the producers, and the output is each well's cumulative oil production. In the formed dataset, various extracted parameters relating to the wells are prepared, such as well types, indexes, trajectories, and cumulative oil production. Furthermore, a grid-based SPM is constructed in parallel to the well-based SPM. At each timestep of the prediction, dynamic parameters relating to the grids (in this case, grid pressures and saturations) are transferred to the existing well-based dataset. This technique helps the well-based SPM further increase in accuracy by finding new patterns within the dataset. We implement an average of 23 different models to rank features and perform the feature selection process. Finally, the CNN model is trained on the dataset and coupled with two derivative-free optimizers, a GA and a particle swarm optimizer, to maximize oil production over the selected time period. The sequential sampling used in this work is a novel technique to construct the SPM with the lowest number of numerical model executions; it provides an efficient sampling workflow, saving time instead of repeating all the SPM construction steps. The average feature ranking implemented in this paper provides the best prioritization of input parameters and a confident ranking for the feature selection step. Finally, the underlying CNN model is compared to the prediction accuracy of an ANN model.
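The sampling step described here can be sketched with SciPy's quasi-Monte Carlo module. The rate bounds and problem size below are assumptions for illustration; the GA coupling and the reservoir-model runs are beyond this sketch.

```python
from scipy.stats import qmc

# Latin hypercube samples for the designed parameters: one liquid-rate
# value per producer, scaled to assumed bounds in stb/day.
n_producers, n_samples = 4, 50
sampler = qmc.LatinHypercube(d=n_producers, seed=0)
unit = sampler.random(n=n_samples)                  # points in [0, 1]^d
rates = qmc.scale(unit,
                  l_bounds=[500.0] * n_producers,
                  u_bounds=[5000.0] * n_producers)
print(rates.shape)    # (50, 4): one rate vector per simulation run
```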
7. Foreman, Geoff, Steven Bott, Jeffrey Sutherland, and Stephan Tappert. "The Development and Use of an Absolute Depth Size Specification in ILI-Based Crack Integrity Management of Pipelines." In 2016 11th International Pipeline Conference. American Society of Mechanical Engineers, 2016. http://dx.doi.org/10.1115/ipc2016-64224.

Abstract:
To provide a more insightful and accurate feature description from crack in-line inspection (ILI) reporting, as per the Fitness for Service analysis in API 1176, individual crack dimensions must be established to a given accuracy. PII Pipeline Solutions established an absolute depth sizing specification conforming to the dig verification processes of API 1163. This change represented a significant shift from the traditional reporting format, in which depths were sized in "bands" of 1–2 mm, 2–3 mm and > 3 mm within crack ILI inspection reporting. When assessing features whose characteristics are stated in a sizing band, the pipeline integrity assessment approach required the conservative assumption that all features in a band be treated as if they had the deepest value in that band. The specification thus created only three sizes of crack depth: 1–2 mm, 2–3 mm, and > 3 mm (± 0.5 mm tolerance at 90% certainty). In practical terms, a large quantity of features in the significant 2–3 mm band had to be treated as potential dig candidates with a depth of at least 3 mm, making length the only basis for ranking severity in priority dig selection. Previous attempts at establishing absolute depth sizing for crack inspection required a series of calibration digs. Here, the large sample size over multiple inspection runs and pipeline sections allowed a statistical specification algorithm to be developed as part of the analysis process, so no additional reporting time or excavation cost was involved. The new absolute sizing algorithm has provided operators with a means of prioritizing digs based on individual feature lengths and depths. Replacing the traditional depth bands with feature-specific peak depths is thereby a major step forward in achieving a cost-effective process for prioritizing crack mitigation in pipelines. Following the dig verification process in API 1163, significant populations of in-field NDE results were utilized on a variety of pipeline sections of different diameters. Predicted absolute depth estimation accuracy was determined for specific feature types, thereby creating a depth tolerance, with statistical certainty levels established that match those available and recognized for metal loss ILI. This paper describes the process and the means by which an absolute depth crack ILI specification was established using characteristics from a significant set of real features. It also describes the benefits realized within pipeline integrity engineering from moving to such a new reporting protocol.
8. Idogun, Akpevwe Kelvin, Ruth Oyanu Ujah, and Lesley Anne James. "Surrogate-Based Analysis of Chemical Enhanced Oil Recovery – A Comparative Analysis of Machine Learning Model Performance." In SPE Nigeria Annual International Conference and Exhibition. SPE, 2021. http://dx.doi.org/10.2118/208452-ms.

Abstract:
Optimizing decision and design variables for chemical EOR is imperative for sensitivity and uncertainty analysis. However, these processes involve multiple reservoir simulation runs, which increases computational cost and time. Surrogate models can overcome this impediment, as they are capable of mimicking full-field three-dimensional reservoir simulation models in detail and complexity. Artificial Neural Networks (ANN) and regression-based Design of Experiments (DoE) are common methods for surrogate modelling. In this study, a comparative analysis of data-driven surrogate model performance on the Recovery Factor (RF) for Surfactant-Polymer flooding is investigated with seven input variables: Kv/Kh ratio, polymer concentration in polymer drive, surfactant slug size, surfactant concentration in surfactant slug, polymer concentration in surfactant slug, polymer drive size and salinity of polymer drive. Eleven machine learning models, including Multiple Linear Regression (MLR), Ridge and Lasso regression, Support Vector Regression (SVR), ANN, as well as Classification and Regression Tree (CART) based algorithms including Decision Trees, Random Forest, eXtreme Gradient Boosting (XGBoost), Gradient Boosting and Extremely Randomized Trees (ERT), are applied to a dataset consisting of 202 data points. The results indicate high model performance and accuracy for SVR, ANN and the CART-based ensemble techniques (Extremely Randomized Trees, Gradient Boosting and XGBoost regression), with high R2 values and the lowest Mean Squared Error (MSE) values for the training and test datasets. Unlike other studies on chemical EOR surrogate modelling, where sensitivity was analyzed with statistical DoE, we rank the input features using Decision Tree-based algorithms, while model interpretability is achieved with Shapley values. The feature ranking indicates that surfactant concentration and slug size are the most influential parameters on the RF. Other important factors, though with less influence, are the polymer concentration in the surfactant slug, the polymer concentration in the polymer drive and the polymer drive size. The salinity of the polymer drive and the Kv/Kh ratio both have a negative effect on the RF and show the lowest level of significance.
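A compact Python sketch of the interpretability recipe described here, tree-based ranking with Shapley values via the shap package; the random stand-in data and model settings are ours, not the paper's dataset.

```python
import numpy as np
import shap                                    # pip install shap
from sklearn.ensemble import RandomForestRegressor

# Stand-ins for the 202 data points: seven design variables -> RF.
X = np.random.rand(202, 7)
y = np.random.rand(202)

# Fit a tree ensemble, then rank features by mean absolute SHAP value.
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
shap_values = shap.TreeExplainer(model).shap_values(X)
ranking = np.argsort(np.abs(shap_values).mean(axis=0))[::-1]
print(ranking)    # feature indices, most influential first
```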
9. Adl, Amin Ahmadi, Xiaoning Qian, Ping Xu, Kendra Vehik, and Jeffrey P. Krischer. "Feature Ranking Based on Synergy Networks to Identify Prognostic Markers in DPT-1." In 2012 IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS). IEEE, 2012. http://dx.doi.org/10.1109/gensips.2012.6507728.

10. Len, Przemysław. "The Use of Statistical Methods in Creation of the Urgency Ranking of the Land Consolidation and Land Exchange Works." In Environmental Engineering. VGTU Technika, 2017. http://dx.doi.org/10.3846/enviro.2017.212.

Abstract:
In analyses of the urgency of land consolidation and land exchange works, and particularly in spatial comparative analyses, it is helpful to use methods of multivariate statistics, which allow the determination of a synthetic measure. Synthetic measures replace a large set of object attributes with one aggregate variable, allowing the analyzed objects (villages) to be ordered in terms of the phenomenon in question: the urgency of carrying out land consolidation and exchange works. The aim of the paper is to determine measures of this urgency according to the method proposed by Z. Hellwig and to compare the obtained results with those obtained using the zero unitarisation method (ZUM). The aim of the analysis is to verify how the use of different methods to aggregate the same diagnostic variables affects the results of the research. The research covers 14 precincts located in the municipality of Białaczów, in the Łódzkie voivodship. To construct the synthetic measure of the urgency of carrying out land consolidation and exchange works, 5 groups of features characterizing these works were adopted.
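Both aggregation methods compared in the paper are short enough to sketch in Python for stimulant variables: Hellwig's pattern-of-development measure and zero unitarisation. The toy data stands in for the five feature groups over 14 villages.

```python
import numpy as np

def hellwig_measure(X):
    """Hellwig's measure for stimulant variables: standardize, take the
    per-variable maximum as the development pattern, and convert the
    Euclidean distances to it into scores (higher = closer to pattern,
    i.e. more urgent here)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    pattern = Z.max(axis=0)
    c = np.sqrt(((Z - pattern) ** 2).sum(axis=1))
    c0 = c.mean() + 2 * c.std()        # the customary normalizing constant
    return 1.0 - c / c0

def zum_measure(X):
    """Zero unitarisation: rescale each stimulant to [0, 1], then average."""
    U = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
    return U.mean(axis=1)

X = np.random.rand(14, 5)              # 14 villages, 5 feature groups (toy)
print(np.argsort(hellwig_measure(X))[::-1])   # urgency ordering
print(np.argsort(zum_measure(X))[::-1])       # compare the two orderings
```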