Journal articles on the topic 'SOFTWARE PREDICTION MODELS'

Consult the top 50 journal articles for your research on the topic 'SOFTWARE PREDICTION MODELS.'

1

Balogun, A. O., A. O. Bajeh, H. A. Mojeed, and A. G. Akintola. "Software defect prediction: A multi-criteria decision-making approach." Nigerian Journal of Technological Research 15, no. 1 (April 30, 2020): 35–42. http://dx.doi.org/10.4314/njtr.v15i1.7.

Abstract:
Failures of software systems due to inadequate software testing are rampant, as modern software systems are large and complex. Software testing, an integral part of the software development life cycle (SDLC), consumes both human and capital resources. As such, software defect prediction (SDP) mechanisms are deployed to strengthen the software testing phase of the SDLC by predicting defect-prone modules or components in software systems. Machine learning models have been used to develop SDP models with considerable success. Moreover, some studies have highlighted that a combination of machine learning models in the form of an ensemble performs better than single SDP models in terms of prediction accuracy. However, the efficiency of machine learning models can change under diverse predictive evaluation metrics. Thus, more studies are needed to establish the effectiveness of ensemble SDP models over single SDP models. This study proposes the deployment of Multi-Criteria Decision Method (MCDM) techniques to rank machine learning models. Analytic Network Process (ANP) and Preference Ranking Organization Method for Enrichment Evaluation (PROMETHEE), two MCDM techniques, are applied to 9 machine learning models with 11 performance evaluation metrics and 11 software defect datasets. The experimental results showed that ensemble SDP models are the most appropriate SDP models, as Boosted SMO and Boosted PART ranked highest under each of the MCDM techniques. The experimental results also supported the position that accuracy should not be the only performance evaluation metric for SDP models. In conclusion, performance metrics beyond predictive accuracy should be considered when ranking and evaluating machine learning models. Keywords: Ensemble; Multi-Criteria Decision Method; Software Defect Prediction
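A minimal, self-contained sketch of the PROMETHEE II outranking step described in this abstract; the model names, metric scores, and criterion weights below are hypothetical, and a simple "usual-criterion" preference function stands in for whatever configuration the paper actually used.

```python
def promethee_net_flows(scores, weights):
    """Rank alternatives by PROMETHEE II net outranking flows.

    scores[a] is a list of criterion values for alternative a
    (higher is better on every criterion); weights sum to 1.
    """
    names = list(scores)
    n = len(names)

    # Usual-criterion preference: a criterion's full weight counts toward
    # pi(a, b) whenever a strictly beats b on that criterion.
    def pref(a, b):
        return sum(w for w, x, y in zip(weights, scores[a], scores[b]) if x > y)

    phi_plus = {a: sum(pref(a, b) for b in names if b != a) / (n - 1) for a in names}
    phi_minus = {a: sum(pref(b, a) for b in names if b != a) / (n - 1) for a in names}
    return {a: phi_plus[a] - phi_minus[a] for a in names}

# Hypothetical evaluation of three SDP models on three metrics (e.g. AUC, recall, precision).
scores = {
    "BoostedSMO": [0.92, 0.88, 0.90],
    "NaiveBayes": [0.85, 0.80, 0.83],
    "DecisionTree": [0.88, 0.84, 0.79],
}
weights = [0.4, 0.3, 0.3]
flows = promethee_net_flows(scores, weights)
ranking = sorted(flows, key=flows.get, reverse=True)
print(ranking)  # BoostedSMO dominates every criterion, so it ranks first
```

Net flows always sum to zero across alternatives, which is a quick sanity check on any PROMETHEE implementation.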
2

Malhotra, Ruchika, and Juhi Jain. "Predicting Software Defects for Object-Oriented Software Using Search-based Techniques." International Journal of Software Engineering and Knowledge Engineering 31, no. 02 (February 2021): 193–215. http://dx.doi.org/10.1142/s0218194021500054.

Abstract:
Defect-free development is unrealistic. Timely detection of software defects favors proper resource utilization, saving time, effort, and money. With the increasing size and complexity of software, demand for accurate and efficient prediction models is growing. Recently, search-based techniques (SBTs) have attracted many researchers working on Software Defect Prediction (SDP). The goal of this study is to conduct an empirical evaluation to assess the applicability of SBTs for predicting software defects in object-oriented (OO) software. In this study, 16 SBTs are exploited to build defect prediction models for 13 OO software projects. Stable performance measures (GMean, Balance, and Receiver Operating Characteristic-Area Under Curve (ROC-AUC)) are employed to probe the predictive capability of the developed models, taking into consideration the imbalanced nature of software datasets. Proper measures are taken to handle the stochastic behavior of SBTs. The significance of the results is statistically validated using the Friedman test coupled with Wilcoxon post hoc analysis. The results confirm that software defects can be detected in the early phases of software development with the help of SBTs. This paper identifies an effective subset of SBTs that will aid software practitioners in detecting probable software defects in time, thereby saving resources and delivering good-quality software. Eight SBTs (sUpervised Classification System (UCS), Bioinformatics-oriented hierarchical evolutionary learning (BIOHEL), CHC, Genetic Algorithm-based Classifier System with Adaptive Discretization Intervals (GA_ADI), Genetic Algorithm-based Classifier System with Intervalar Rule (GA_INT), Memetic Pittsburgh Learning Classifier System (MPLCS), Population-Based Incremental Learning (PBIL), and Steady-State Genetic Algorithm for Instance Selection (SGA)) are found to be statistically good defect predictors.
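The imbalance-aware measures named in this abstract (GMean and Balance) are straightforward to compute from confusion-matrix counts; the counts below are made up purely for illustration.

```python
import math

def gmean_and_balance(tp, fn, fp, tn):
    """GMean and Balance from confusion-matrix counts.

    pd (recall) = TP / (TP + FN); pf (false-alarm rate) = FP / (FP + TN).
    GMean is the geometric mean of recall and specificity; Balance is the
    normalized distance from the point (pf, pd) to the ideal point (0, 1).
    """
    pd = tp / (tp + fn)
    pf = fp / (fp + tn)
    gmean = math.sqrt(pd * (1 - pf))
    balance = 1 - math.sqrt(pf ** 2 + (1 - pd) ** 2) / math.sqrt(2)
    return gmean, balance

# Hypothetical confusion matrix for a defect predictor on an imbalanced test set.
gmean, balance = gmean_and_balance(tp=40, fn=10, fp=30, tn=220)
print(round(gmean, 3), round(balance, 3))
```

Both measures reward high recall on the (rare) defective class while penalizing false alarms, which plain accuracy does not.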
3

Vandecruys, Olivier, David Martens, Bart Baesens, Christophe Mues, Manu De Backer, and Raf Haesen. "Mining software repositories for comprehensible software fault prediction models." Journal of Systems and Software 81, no. 5 (May 2008): 823–39. http://dx.doi.org/10.1016/j.jss.2007.07.034.

4

Zaim, Amirul, Johanna Ahmad, Noor Hidayah Zakaria, Goh Eg Su, and Hidra Amnur. "Software Defect Prediction Framework Using Hybrid Software Metric." JOIV : International Journal on Informatics Visualization 6, no. 4 (December 31, 2022): 921. http://dx.doi.org/10.30630/joiv.6.4.1258.

Abstract:
Software fault prediction is widely used in the software development industry. Moreover, software development has accelerated significantly during the pandemic. However, the main problem is that most fault prediction models disregard object-oriented metrics, even as researchers concentrate on predicting software problems early in the development process. This research presents a procedure that includes object-oriented metrics to predict software faults at the class level, along with feature selection techniques to assess the effectiveness of machine learning algorithms at predicting software faults. This research aims to assess the effectiveness of software fault prediction using feature selection techniques. In the present work, software metrics have been used in defect prediction, and feature selection techniques were included to select the best features from the dataset. The results show that process metrics had slightly better accuracy than code metrics.
5

Kalouptsoglou, Ilias, Miltiadis Siavvas, Dionysios Kehagias, Alexandros Chatzigeorgiou, and Apostolos Ampatzoglou. "Examining the Capacity of Text Mining and Software Metrics in Vulnerability Prediction." Entropy 24, no. 5 (May 5, 2022): 651. http://dx.doi.org/10.3390/e24050651.

Abstract:
Software security is a very important aspect for software development organizations that wish to provide high-quality and dependable software to their consumers. A crucial part of software security is the early detection of software vulnerabilities. Vulnerability prediction is a mechanism that facilitates the identification (and, in turn, the mitigation) of vulnerabilities early enough during the software development cycle. The scientific community has recently focused a lot of attention on developing Deep Learning models using text mining techniques for predicting the existence of vulnerabilities in software components. However, there are also studies that examine whether the utilization of statically extracted software metrics can lead to adequate Vulnerability Prediction Models. In this paper, both software metrics- and text mining-based Vulnerability Prediction Models are constructed and compared. A combination of software metrics and text tokens using deep-learning models is examined as well, in order to investigate whether a combined model can lead to more accurate vulnerability prediction. For the purposes of the present study, a vulnerability dataset containing vulnerabilities from real-world software products is utilized and extended. The results of our analysis indicate that text mining-based models outperform software metrics-based models with respect to their F2-score, whereas enriching the text mining-based models with software metrics was not found to provide any added value to their predictive performance.
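The F2-score used for the comparison in this abstract weights recall four times as heavily as precision, reflecting that a missed vulnerability costs more than a false alarm. A small stdlib sketch with hypothetical labels:

```python
def fbeta(y_true, y_pred, beta=2.0):
    """F-beta score for binary labels (1 = vulnerable component)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical labels for six components: the model catches both vulnerable ones
# but raises one false alarm.
y_true = [1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
print(round(fbeta(y_true, y_pred), 3))
```

With perfect recall and precision 2/3, F2 works out to 10/11, close to 0.909; F1 on the same predictions would be 0.8.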
6

Shatnawi, Raed. "Software fault prediction using machine learning techniques with metric thresholds." International Journal of Knowledge-based and Intelligent Engineering Systems 25, no. 2 (July 26, 2021): 159–72. http://dx.doi.org/10.3233/kes-210061.

Abstract:
BACKGROUND: Fault data is vital to predicting fault-proneness in large systems. Predicting faulty classes helps in allocating the appropriate testing resources for future releases. However, current fault data face challenges such as unlabeled instances and data imbalance. These challenges degrade the performance of prediction models. Data imbalance arises because the majority of classes are labeled as not faulty whereas a minority are labeled as faulty. AIM: The research proposes to improve fault prediction using software metrics in combination with threshold values. Statistical techniques are proposed to improve the quality of the datasets and therefore the quality of the fault prediction. METHOD: Threshold values of object-oriented metrics are used to label classes as faulty to improve the fault prediction models. The resulting datasets are used to build prediction models using five machine learning techniques. The use of threshold values is validated on ten large object-oriented systems. RESULTS: The models are built for the datasets with and without the use of thresholds. The combination of thresholds with machine learning has improved the fault prediction models significantly for the five classifiers. CONCLUSION: Threshold values can be used to label software classes as fault-prone and to improve machine learners in predicting the fault-prone classes.
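The threshold-labeling idea in this abstract can be sketched in a few lines: a class is marked fault-prone when any metric crosses its cut-off. The metric names and threshold values below are hypothetical, not the validated thresholds from the paper.

```python
# Hypothetical object-oriented metric thresholds (metric -> cut-off value).
THRESHOLDS = {"CBO": 9, "WMC": 20, "RFC": 40}

def label_fault_prone(metrics):
    """Label a class fault-prone if any metric meets or exceeds its threshold."""
    return any(metrics[m] >= t for m, t in THRESHOLDS.items())

# Two hypothetical classes with their measured metric values.
classes = {
    "OrderService": {"CBO": 12, "WMC": 15, "RFC": 33},
    "StringUtils": {"CBO": 3, "WMC": 8, "RFC": 12},
}
labels = {name: label_fault_prone(m) for name, m in classes.items()}
print(labels)  # OrderService exceeds the CBO threshold, StringUtils does not
```

The labeled dataset produced this way can then feed any supervised learner, which is how the paper combines thresholds with machine learning.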
7

Eldho, K. J. "Impact of Unbalanced Classification on the Performance of Software Defect Prediction Models." Indian Journal of Science and Technology 15, no. 6 (February 15, 2022): 237–42. http://dx.doi.org/10.17485/ijst/v15i6.2193.

8

Karunanithi, N., D. Whitley, and Y. K. Malaiya. "Prediction of software reliability using connectionist models." IEEE Transactions on Software Engineering 18, no. 7 (July 1992): 563–74. http://dx.doi.org/10.1109/32.148475.

9

Fenton, N. E., and M. Neil. "A critique of software defect prediction models." IEEE Transactions on Software Engineering 25, no. 5 (1999): 675–89. http://dx.doi.org/10.1109/32.815326.

10

Lawson, John S., Craig W. Wesselman, and Del T. Scott. "Simple Plots Improve Software Reliability Prediction Models." Quality Engineering 15, no. 3 (April 2003): 411–17. http://dx.doi.org/10.1081/qen-120018040.

11

Radliński, Łukasz. "The Impact of Data Quality on Software Testing Effort Prediction." Electronics 12, no. 7 (March 31, 2023): 1656. http://dx.doi.org/10.3390/electronics12071656.

Abstract:
Background: This paper investigates the impact of data quality on the performance of models predicting effort on software testing. Data quality was reflected by training data filtering strategies (data variants) covering combinations of Data Quality Rating, UFP Rating, and a threshold of valid cases. Methods: The experiment used the ISBSG dataset and 16 machine learning models. A process of three-fold cross-validation repeated 20 times was used to train and evaluate each model with each data variant. Model performance was assessed using absolute errors of prediction. A ‘win–tie–loss’ procedure, based on the Wilcoxon signed-rank test, was applied to identify the best models and data variants. Results: Most models, especially the most accurate, performed the best on a complete dataset, even though it contained cases with low data ratings. The detailed results include the rankings of the following: (1) models for particular data variants, (2) data variants for particular models, and (3) the best-performing combinations of models and data variants. Conclusions: Arbitrary and restrictive data selection to only projects with Data Quality Rating and UFP Rating of ‘A’ or ‘B’, commonly used in the literature, does not seem justified. It is recommended not to exclude cases with low data ratings to achieve better accuracy of most predictive models for testing effort prediction.
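The 'win–tie–loss' tally described in this abstract can be sketched as follows. For brevity, a two-sided sign test on the paired differences stands in for the Wilcoxon signed-rank test the paper applies, and the error values are invented.

```python
from math import comb

def win_tie_loss(errors_a, errors_b, alpha=0.05):
    """Tally a win/tie/loss for model A against model B on paired absolute errors.

    The paper uses the Wilcoxon signed-rank test; this sketch substitutes an
    exact two-sided sign test on the paired differences as the significance check.
    """
    diffs = [b - a for a, b in zip(errors_a, errors_b) if a != b]
    n = len(diffs)
    if n == 0:
        return "tie"
    pos = sum(1 for d in diffs if d > 0)  # runs where A's error is smaller
    k = min(pos, n - pos)
    # Exact two-sided binomial p-value under the null hypothesis p = 0.5.
    p = min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n)
    if p >= alpha:
        return "tie"
    return "win" if pos > n - pos else "loss"

# Hypothetical absolute prediction errors over 12 cross-validation runs.
a = [10, 12, 9, 11, 10, 13, 9, 10, 12, 11, 10, 9]
b = [14, 15, 13, 12, 16, 15, 14, 13, 15, 14, 13, 12]
print(win_tie_loss(a, b))  # A beats B on every run, so: win
```

Summing wins, ties, and losses over all model pairs and data variants gives the rankings the paper reports.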
12

GANESAN, K., TAGHI M. KHOSHGOFTAAR, and EDWARD B. ALLEN. "CASE-BASED SOFTWARE QUALITY PREDICTION." International Journal of Software Engineering and Knowledge Engineering 10, no. 02 (April 2000): 139–52. http://dx.doi.org/10.1142/s0218194000000092.

Abstract:
Highly reliable software is becoming an essential ingredient in many systems. However, assuring reliability often entails time-consuming costly development processes. One cost-effective strategy is to target reliability-enhancement activities to those modules that are likely to have the most problems. Software quality prediction models can predict the number of faults expected in each module early enough for reliability enhancement to be effective. This paper introduces a case-based reasoning technique for the prediction of software quality factors. Case-based reasoning is a technique that seeks to answer new problems by identifying similar "cases" from the past. A case-based reasoning system can function as a software quality prediction model. To our knowledge, this study is the first to use case-based reasoning systems for predicting quantitative measures of software quality. A case study applied case-based reasoning to software quality modeling of a family of full-scale industrial software systems. The case-based reasoning system's accuracy was much better than a corresponding multiple linear regression model in predicting the number of design faults. When predicting faults in code, its accuracy was significantly better than a corresponding multiple linear regression model for two of three test data sets and statistically equivalent for the third.
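A minimal sketch of the case-based reasoning idea this abstract describes: retrieve the most similar past cases and average their observed outcomes. The metrics, case base, and fault counts are all hypothetical.

```python
import math

def cbr_predict(query, case_base, k=2):
    """Predict a quality factor for a new module from the k most similar past cases.

    Each case is (metric_vector, observed_fault_count); similarity is the inverse
    of Euclidean distance over (ideally normalized) metric values.
    """
    ranked = sorted(case_base, key=lambda case: math.dist(query, case[0]))
    nearest = ranked[:k]
    # Predict the mean fault count of the retrieved cases.
    return sum(faults for _, faults in nearest) / k

# Hypothetical case base: (LOC/100, cyclomatic complexity) -> observed design faults.
case_base = [
    ((1.2, 4.0), 1),
    ((5.0, 18.0), 9),
    ((4.6, 15.0), 7),
    ((0.8, 3.0), 0),
]
print(cbr_predict((4.8, 16.0), case_base))  # averages the two most similar cases
```

Real CBR systems add feature weighting and case-base maintenance on top of this retrieve-and-reuse core, but the nearest-neighbor lookup is the essence.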
13

Alsolai, Hadeel, and Marc Roper. "The Impact of Ensemble Techniques on Software Maintenance Change Prediction: An Empirical Study." Applied Sciences 12, no. 10 (May 22, 2022): 5234. http://dx.doi.org/10.3390/app12105234.

Abstract:
Various prediction models have been proposed by researchers to predict the change-proneness of classes based on source code metrics. However, some of these models suffer from low prediction accuracy because datasets exhibit high dimensionality or imbalanced classes. Recent studies suggest that using ensembles to integrate several models, select features, or perform sampling has the potential to resolve issues in the datasets and improve the prediction accuracy. This study aims to empirically evaluate the effectiveness of ensemble models, feature selection, and sampling techniques on predicting change-proneness using different metrics. We conduct an empirical study to compare the performance of four machine learning models (naive Bayes, support vector machines, k-nearest neighbors, and random forests) on seven datasets for predicting change-proneness. We use two types of feature selection (relief and Pearson’s correlation coefficient) and three types of ensemble sampling techniques, which integrate different types of sampling techniques (SMOTE, spread sub-sample, and randomize). The results of this study reveal that the ensemble feature selection and sampling techniques yield improved prediction accuracy over most of the investigated models, and using sampling techniques increased the prediction accuracy of all models. Random forests provide a significant improvement over the other prediction models and obtained the highest average area under the curve (AUC) in all scenarios. The proposed ensemble feature selection and sampling techniques, along with the ensemble model (random forests), proved beneficial in improving the prediction accuracy of change-proneness.
14

Yang, Xinli, Jingjing Liu, and Denghui Zhang. "A Comprehensive Taxonomy for Prediction Models in Software Engineering." Information 14, no. 2 (February 10, 2023): 111. http://dx.doi.org/10.3390/info14020111.

Abstract:
Applying prediction models to software engineering is an interesting research area. Many related studies leverage prediction models to achieve good performance in various software engineering tasks. As more and more software engineering research leverages prediction models, there is a need to organize related studies, summarizing which software engineering tasks prediction models can be applied to and how to better leverage them in these tasks. This article constructs a comprehensive taxonomy of prediction models applied to software engineering. We review 136 papers from top conference proceedings and journals in the last decade and summarize 11 research topics to which prediction models can be applied. Based on the papers, we identify several major challenges and directions. We believe this comprehensive taxonomy will help researchers understand the area more deeply and infer several useful and practical implications.
15

CHALLAGULLA, VENKATA UDAYA B., FAROKH B. BASTANI, I.-LING YEN, and RAYMOND A. PAUL. "EMPIRICAL ASSESSMENT OF MACHINE LEARNING BASED SOFTWARE DEFECT PREDICTION TECHNIQUES." International Journal on Artificial Intelligence Tools 17, no. 02 (April 2008): 389–400. http://dx.doi.org/10.1142/s0218213008003947.

Abstract:
Automated reliability assessment is essential for systems that entail dynamic adaptation based on runtime mission-specific requirements. One approach along this direction is to monitor and assess the system using machine learning-based software defect prediction techniques. Due to the dynamic nature of the software data collected, instance-based learning algorithms are proposed for the above purposes. To evaluate the accuracy of these methods, the paper presents an empirical analysis of four different real-time software defect data sets using different predictor models. The results show that a combination of 1R and instance-based learning along with a consistency-based subset evaluation technique provides relatively better consistency in achieving accurate predictions as compared with other models. No direct relationship is observed between the skewness present in the data sets and the prediction accuracy of these models. Principal Component Analysis (PCA) does not show a consistent advantage in improving the accuracy of the predictions. While random reduction of attributes gave poor accuracy results, simple Feature Subset Selection methods performed better than PCA for most prediction models. Based on these results, the paper presents a high-level design of an Intelligent Software Defect Analysis tool (ISDAT) for dynamic monitoring and defect assessment of software modules.
16

John, Boby. "A Brief Review of Software Reliability Prediction Models." International Journal for Research in Applied Science and Engineering Technology V, no. IV (April 27, 2017): 990–97. http://dx.doi.org/10.22214/ijraset.2017.4180.

17

Schneidewind, Norman. "Experience with Risk-Based Software Defect Prediction Models." Journal of Aerospace Computing, Information, and Communication 4, no. 1 (January 2007): 619–27. http://dx.doi.org/10.2514/1.26507.

18

Koru, A. G., and Hongfang Liu. "Building Defect Prediction Models in Practice." IEEE Software 22, no. 6 (November 2005): 23–29. http://dx.doi.org/10.1109/ms.2005.149.

19

Jiang, Yue, Bojan Cukic, and Yan Ma. "Techniques for evaluating fault prediction models." Empirical Software Engineering 13, no. 5 (August 12, 2008): 561–95. http://dx.doi.org/10.1007/s10664-008-9079-3.

20

Malhotra, Ruchika, and Juhi Jain. "Predicting defects in imbalanced data using resampling methods: an empirical investigation." PeerJ Computer Science 8 (April 29, 2022): e573. http://dx.doi.org/10.7717/peerj-cs.573.

Abstract:
The development of correct and effective software defect prediction (SDP) models is one of the most pressing needs of the software industry. Statistics of many defect-related open-source data sets depict the class imbalance problem in object-oriented projects. Models trained on imbalanced data lead to inaccurate future predictions owing to biased learning and ineffective defect prediction. In addition, a large number of software metrics degrades model performance. This study aims at (1) identification of useful metrics in the software using correlation feature selection, (2) extensive comparative analysis of 10 resampling methods to generate effective machine learning models for imbalanced data, (3) inclusion of stable performance evaluators (AUC, GMean, and Balance) and (4) integration of statistical validation of results. The impact of 10 resampling methods is analyzed on selected features of 12 object-oriented Apache datasets using 15 machine learning techniques. The performances of the developed models are analyzed using AUC, GMean, Balance, and sensitivity. Statistical results advocate the use of resampling methods to improve SDP. Random oversampling portrays the best predictive capability of the developed defect prediction models. The study provides a guideline for identifying metrics that are influential for SDP. The performances of oversampling methods are superior to undersampling methods.
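Random oversampling, which this study found to give the best predictive capability, can be sketched with the standard library alone: duplicate minority-class samples at random until the classes are balanced. The toy dataset below is invented.

```python
import random

def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes match the majority size."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(group) for group in by_class.values())
    xs, ys = [], []
    for y, group in by_class.items():
        # Keep every original sample, then top up with random duplicates.
        resampled = group + [rng.choice(group) for _ in range(target - len(group))]
        for x in resampled:
            xs.append(x)
            ys.append(y)
    return xs, ys

# Toy imbalanced data: 6 clean modules, 2 defective ones.
X = [[i] for i in range(8)]
y = [0, 0, 0, 0, 0, 0, 1, 1]
Xr, yr = random_oversample(X, y)
print(yr.count(0), yr.count(1))  # 6 6
```

Only the training split should be resampled; oversampling before the train/test split leaks duplicated minority samples into the evaluation.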
21

Ma, Baojun, Huaping Zhang, Guoqing Chen, Yanping Zhao, and Bart Baesens. "Investigating Associative Classification for Software Fault Prediction: An Experimental Perspective." International Journal of Software Engineering and Knowledge Engineering 24, no. 01 (February 2014): 61–90. http://dx.doi.org/10.1142/s021819401450003x.

Abstract:
It is a recurrent finding that software development is often troubled by considerable delays as well as budget overruns, and several solutions have been proposed in answer to this observation, software fault prediction being a prime example. Drawing upon machine learning techniques, software fault prediction tries to identify upfront the software modules that are most likely to contain faults, thereby streamlining testing efforts and improving overall software quality. When deploying fault prediction models in a production environment, both prediction performance and model comprehensibility are typically taken into consideration, although the latter is commonly overlooked in the academic literature. Many classification methods have been suggested to conduct fault prediction; yet associative classification methods remain uninvestigated in this context. This paper proposes an associative classification (AC)-based fault prediction method, building upon the CBA2 algorithm. In an empirical comparison on 12 real-world datasets, the AC-based classifier is shown to achieve a predictive performance competitive with those of models induced by five other tree/rule-based classification techniques. In addition, our findings highlight the comprehensibility of the AC-based models at similar prediction performance. Furthermore, the possibilities of cross-project prediction are investigated, strengthening earlier findings on the feasibility of such an approach when insufficient data on the target project are available.
22

Mariño, Perfecto, Francisco Poza, Santiago Otero, and Fernando Machado. "Multidisciplinary Software Developments in a Power Transformers Scenario." Key Engineering Materials 293-294 (September 2005): 635–42. http://dx.doi.org/10.4028/www.scientific.net/kem.293-294.635.

Abstract:
Power transformers’ failures carry great costs to electric companies. To diminish this problem in four working 40 MVA transformers, the authors have implemented the measurement system of a failure prediction tool, which is the basis of a predictive maintenance infrastructure. The prediction models obtain their inputs from sensors, whose values must be conditioned, sampled and filtered before feeding the forecasting algorithms. Applying Data Warehouse techniques, the models have been provided with an abstraction of sensors the authors have called Virtual Cards. By means of these virtual devices, models have access to clean data, both fresh and historic, from the set of sensors they need. Besides, several characteristics of the data flow coming from the Virtual Cards, such as the sample rate or the set of sensors itself, can be dynamically reconfigured. A replication scheme was implemented to allow the distribution of demanding processing tasks and the remote management of the prediction applications.
23

Kumar, Ajay, and Kamaldeep Kaur. "SOM-FTS: A Hybrid Model for Software Reliability Prediction and MCDM-Based Evaluation." International Journal of Engineering and Technology Innovation 12, no. 4 (June 27, 2022): 308–21. http://dx.doi.org/10.46604/ijeti.2022.8546.

Abstract:
The objective of this study is to propose a hybrid model based on self-organized maps (SOM) and fuzzy time series (FTS) for predicting the reliability of software systems. The proposed SOM-FTS model is compared with eleven traditional machine learning-based models. The problem of selecting a suitable software reliability prediction model is represented as a multi-criteria decision-making (MCDM) problem. Twelve software reliability prediction models, including the proposed SOM-FTS model, are evaluated using three MCDM methods, four performance measures, and three software failure datasets. The results show that the proposed SOM-FTS model is the most suitable model among the twelve software reliability prediction models on the basis of MCDM ranking.
24

Canaparo, Marco, and Elisabetta Ronchieri. "Data Mining Techniques for Software Quality Prediction in Open Source Software." EPJ Web of Conferences 214 (2019): 05007. http://dx.doi.org/10.1051/epjconf/201921405007.

Abstract:
Software quality monitoring and analysis are among the most productive topics in software engineering research. Their results may be effectively employed by engineers during the software development life cycle. Open source software constitutes a valid test case for the assessment of software characteristics. The data mining approach has been proposed in the literature to extract software characteristics from software engineering data. This paper aims at comparing diverse data mining techniques (e.g., derived from machine learning) for developing effective software quality prediction models. To achieve this goal, we tackled various issues, such as the collection of software metrics from open source repositories, the assessment of prediction models to detect software issues, and the adoption of statistical methods to evaluate data mining techniques. The results of this study aim to identify the data mining techniques that perform best among all those used in this paper for software quality prediction models.
25

JIANG, YUE, BOJAN CUKIC, TIM MENZIES, and JIE LIN. "INCREMENTAL DEVELOPMENT OF FAULT PREDICTION MODELS." International Journal of Software Engineering and Knowledge Engineering 23, no. 10 (December 2013): 1399–425. http://dx.doi.org/10.1142/s0218194013500447.

Abstract:
The identification of fault-prone modules has a significant impact on software quality assurance. In addition to prediction accuracy, one of the most important goals is to detect fault-prone modules as early as possible in the development lifecycle. Requirements, design, and code metrics have been successfully used for predicting fault-prone modules. In this paper, we investigate the benefits of the incremental development of software fault prediction models. We compare the performance of these models as the volume of data and their life cycle origin (design, code, or their combination) evolve during project development. We analyze 14 data sets from publicly available software engineering data repositories. These data sets offer both design and code metrics. Using a number of modeling techniques and statistical significance tests, we confirm that increasing the volume of training data improves model performance. Further, models built from code metrics typically outperform those built using design metrics only. However, both types of models prove to be useful, as they can be constructed in different phases of the life cycle. Code-based models can be used to increase the effectiveness of assigning verification and validation activities late in the development life cycle. We also conclude that models that utilize a combination of design and code level metrics outperform models which use either one metric set exclusively.
26

SCHNEIDEWIND, NORMAN. "SOFTWARE RISK ANALYSIS." International Journal of Reliability, Quality and Safety Engineering 16, no. 02 (April 2009): 117–36. http://dx.doi.org/10.1142/s0218539309003320.

Abstract:
There has been a lack of attention to the subject of risk management in the design and operation of software. This is strange because the risk to reliability is a critical problem in attempts to achieve safe operation of the software. To address this problem, we evaluate existing models and introduce a new model for software risk prediction. The new model, a cumulative failures gradient function, is based on the principles of neural networks. This metric identifies the minimum test time required to achieve maximum improvement in software quality. We used three NASA Space Shuttle software systems in the evaluation of both existing and new models. The results showed that it was not possible to consistently rank these systems because the validity of the risk predictions varied depending on the risk model that was used. Therefore, the results suggest that it is advisable to use a variety of models to comprehensively evaluate software risk.
27

Islam, Mohammad Rubyet, and Peter Sandborn. "Demonstration of a Response Time Based Remaining Useful Life (RUL) Prediction for Software Systems." Journal of Prognostics and Health Management 3, no. 1 (February 25, 2023): 9–36. http://dx.doi.org/10.22215/jphm.v3i1.3641.

Abstract:
Prognostic and Health Management (PHM) has been widely applied to hardware systems in the electronics and non-electronics domains but has not been explored for software. While software does not decay over time, it can degrade over release cycles. Software health management is confined to diagnostic assessments that identify problems, whereas prognostic assessment potentially indicates when in the future a problem will become detrimental. Relevant research areas such as software defect prediction, software reliability prediction, predictive maintenance of software, software degradation, and software performance prediction, exist, but all of these represent diagnostic models built upon historical data – none of which can predict an RUL for software. This paper addresses the application of PHM concepts to software systems for fault predictions and RUL estimation. Specifically, this paper addresses how PHM can be used to make decisions for software systems such as version update/upgrade, module changes, system re-engineering, rejuvenation, maintenance scheduling, budgeting, and total abandonment. This paper presents a method to prognostically and continuously predict the RUL of a software system based on usage parameters (e.g., the numbers and categories of releases) and performance parameters (e.g., response time). The model developed has been validated by comparing actual data, with the results that were generated by predictive models. Statistical validation (regression validation, and k-fold cross validation) has also been carried out. A case study, based on publicly available data for the Bugzilla application is presented. This case study demonstrates that PHM concepts can be applied to software systems and RUL can be calculated to make system management decisions.
APA, Harvard, Vancouver, ISO, and other styles
28

Gradišnik, Mitja, Tina Beranič, and Sašo Karakatič. "Impact of Historical Software Metric Changes in Predicting Future Maintainability Trends in Open-Source Software Development." Applied Sciences 10, no. 13 (July 3, 2020): 4624. http://dx.doi.org/10.3390/app10134624.

Full text
Abstract:
Software maintenance is one of the key stages in the software lifecycle and includes a variety of activities that consume a significant portion of the costs of a software project. Previous research suggests that future software maintainability can be predicted based on various source code aspects, but most of the research focuses on prediction based on the present state of the code and ignores its history. While taking history into account in software maintainability prediction seems intuitive, research empirically testing this has not been done, and doing so is the main goal of this paper. This paper empirically evaluates the contribution of historical measurements of the Chidamber & Kemerer (C&K) software metrics to software maintainability prediction models. The main contribution of the paper is the building of prediction models with classification and regression trees and random forest learners in iterations, gradually adding historical measurement data extracted from previous releases. The maintainability prediction models were built based on software metric measurements obtained from real-world open-source software projects. The analysis of the results shows that an additional amount of historical metric measurements contributes to maintainability prediction. Additionally, the study evaluates the contribution of individual C&K software metrics to the performance of maintainability prediction models.
APA, Harvard, Vancouver, ISO, and other styles
29

Timonidis, Nestor, Rembrandt Bakker, and Paul Tiesinga. "Prediction of a Cell-Class-Specific Mouse Mesoconnectome Using Gene Expression Data." Neuroinformatics 18, no. 4 (May 24, 2020): 611–26. http://dx.doi.org/10.1007/s12021-020-09471-x.

Full text
Abstract:
Reconstructing brain connectivity at sufficient resolution for computational models designed to study the biophysical mechanisms underlying cognitive processes is extremely challenging. For such a purpose, a mesoconnectome that includes laminar and cell-class specificity would be a major step forward. We analyzed the ability of gene expression patterns to predict cell-class and layer-specific projection patterns and assessed the functional annotations of the most predictive groups of genes. To achieve our goal we used publicly available volumetric gene expression and connectivity data and we trained computational models to learn and predict cell-class and layer-specific axonal projections using gene expression data. Predictions were done in two ways, namely predicting projection strengths using the expression of individual genes and using the co-expression of genes organized in spatial modules, as well as predicting binary forms of projection. For predicting the strength of projections, we found that ridge (L2-regularized) regression had the highest cross-validated accuracy with a median r² score of 0.54, which corresponded for binarized predictions to a median area under the ROC curve of 0.89. Next, we identified 200 spatial gene modules using a dictionary learning and sparse coding approach. We found that these modules yielded predictions of comparable accuracy, with a median r² score of 0.51. Finally, a gene ontology enrichment analysis of the most predictive gene groups resulted in significant annotations related to postsynaptic function. Taken together, we have demonstrated a prediction workflow that can be used to perform multimodal data integration to improve the accuracy of the predicted mesoconnectome and support other neuroscience use cases.
APA, Harvard, Vancouver, ISO, and other styles
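The ridge (L2-regularized) regression reported above can be illustrated with a minimal one-feature sketch. This is a generic illustration, not the authors' pipeline; the `ridge_fit` and `r2_score` names are hypothetical. For a single centered feature, the closed-form ridge coefficient is w = Σxy / (Σx² + λ):

```python
def ridge_fit(xs, ys, lam):
    """Closed-form ridge coefficient for one centered feature:
    w = sum(x*y) / (sum(x^2) + lam)."""
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(x * x for x in xs) + lam
    return num / den

def r2_score(ys, preds):
    """Coefficient of determination, the accuracy metric quoted in the abstract."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# A larger lam shrinks the coefficient toward zero (the regularization effect).
w_unreg = ridge_fit([-2, -1, 1, 2], [-4, -2, 2, 4], 0.0)   # -> 2.0
w_reg = ridge_fit([-2, -1, 1, 2], [-4, -2, 2, 4], 10.0)    # -> 1.0
```

In the study itself the model is multivariate (one coefficient per gene or module), but the shrinkage behaviour shown here is the same.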
30

Siswantoro, Muhammad Zain Fawwaz Nuruddin, and Umi Laili Yuhana. "Software Defect Prediction Based on Optimized Machine Learning Models: A Comparative Study." Teknika 12, no. 2 (June 30, 2023): 166–72. http://dx.doi.org/10.34148/teknika.v12i2.634.

Full text
Abstract:
Software defect prediction is crucial for detecting possible defects in software before they manifest. While machine learning models have become more prevalent in software defect prediction, their effectiveness may vary based on the dataset and the hyperparameters of the model. Difficulties arise in determining the most suitable hyperparameters for the model, as well as in identifying the prominent features that serve as input to the classifier. This research aims to evaluate various traditional machine learning models optimized for software defect prediction on NASA MDP (Metrics Data Program) datasets. The datasets were classified using k-nearest neighbors (k-NN), decision trees, logistic regression, linear discriminant analysis (LDA), a single-hidden-layer multilayer perceptron (SHL-MLP), and Support Vector Machine (SVM). The hyperparameters of the models were fine-tuned using random search, and the feature dimensionality was reduced using principal component analysis (PCA). The synthetic minority oversampling technique (SMOTE) was implemented to oversample the minority class in order to correct the class imbalance. k-NN was found to be the most suitable for software defect prediction on several datasets, while SHL-MLP and SVM were also effective on certain datasets. It is noteworthy that logistic regression and LDA did not perform as well as the other models. Moreover, the optimized models outperform the baseline models in terms of classification accuracy. The choice of model for software defect prediction should be based on the specific characteristics of the dataset. Furthermore, hyperparameter tuning can improve the accuracy of machine learning models in predicting software defects.
APA, Harvard, Vancouver, ISO, and other styles
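The random-search tuning described in the abstract above can be sketched in miniature: randomly sample hyperparameter candidates and keep the best-scoring one. This toy version tunes only k for a 1-D k-NN classifier scored by leave-one-out accuracy; it is an assumed setup for illustration, not the paper's experiment, and all names are hypothetical.

```python
import random

def knn_predict(train, query, k):
    # Majority vote among the k nearest training points (1-D distance,
    # binary 0/1 labels).
    neigh = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    votes = sum(label for _, label in neigh)
    return 1 if votes * 2 > k else 0

def loo_accuracy(data, k):
    # Leave-one-out accuracy: hold out each point and predict it from the rest.
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, x, k) == y
    return hits / len(data)

def random_search_k(data, candidates, trials, seed=0):
    # Randomly sample k values and keep the one with the best LOO accuracy.
    rng = random.Random(seed)
    best_k, best_acc = None, -1.0
    for _ in range(trials):
        k = rng.choice(candidates)
        acc = loo_accuracy(data, k)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k, best_acc

# Two well-separated clusters: small k classifies perfectly, k=5 does not.
toy = [(0, 0), (1, 0), (2, 0), (10, 1), (11, 1), (12, 1)]
best_k, best_acc = random_search_k(toy, [1, 3, 5], trials=30)
```

Real tuning would sample several hyperparameters jointly (k, weighting, distance function) and use k-fold rather than leave-one-out validation; the loop structure is the same.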
31

Zighed, Narimane, Nora Bounour, and Abdelhak-Djamel Seriai. "Comparative Analysis of Object-Oriented Software Maintainability Prediction Models." Foundations of Computing and Decision Sciences 43, no. 4 (December 1, 2018): 359–74. http://dx.doi.org/10.1515/fcds-2018-0018.

Full text
Abstract:
Software maintainability is one of the most important aspects when evaluating the quality of a software product. It is defined as the ease with which existing software can be modified. In the literature, several researchers have proposed a large number of models to measure and predict maintainability throughout different phases of the Software Development Life Cycle. However, only a few attempts have been made to conduct a comparative study of the existing proposed prediction models. In this paper, we present a detailed classification and conduct a comparative analysis of Object-Oriented software maintainability prediction models. Furthermore, we considered the aforementioned models from three perspectives: the architecture, design, and code levels. To the best of our knowledge, such an analysis comprising all three levels has not been conducted in previous research. Moreover, this study outlines fundamental considerations on how to measure maintainability, noting that maintainability is measured differently at each level. In addition, we focus on the strengths and weaknesses of these models. The comparative study shows that several statistical and machine learning techniques have been employed for software maintainability prediction at the code level during the last decade, and each technique possesses specific characteristics for developing an accurate prediction model. At the design level, the majority of the prediction models measured maintainability according to the characteristics of quality models, whereas at the architectural level the techniques adopted are still limited and only a few studies have been conducted in this regard.
APA, Harvard, Vancouver, ISO, and other styles
32

Diwan, Sinan, and Abdul Syukor Mohamad. "Machine Learning Empowered Software Prediction System." Wasit Journal of Computer and Mathematics Science 1, no. 3 (October 1, 2022): 54–64. http://dx.doi.org/10.31185/wjcm.61.

Full text
Abstract:
Prediction of software defects is one of the most active research fields in software engineering today. Using a defect prediction model, a list of code artifacts prone to defects may be compiled, and software may be made more reliable by identifying and discovering faults before or during the software enhancement process. As the scope of software projects grows, defect prediction will play an increasingly important role in the design process. In this context, the performance of a defect prediction procedure is measured by the bugs, or number of bugs, it identifies. Defect prediction models can incorporate a wide range of metrics, including source code and process measurements, and a variety of models are used to determine defects. A defect prediction model may be developed using machine learning, which relies on training data and evaluation data to analyze model performance; defect prediction models typically use 90 percent of the data for training and 10 percent for testing. Prediction performance can be further improved through active/semi-supervised learning, a machine learning approach. A simulated environment can be established to house the entire method so that the results and conclusions can be sharply defined under many circumstances and factors. Computer-aided engineering tools are used to identify software defects in the context of neural networks. In this research paper, neural-network-based software fault prediction is compared with baseline fuzzy logic results; according to the reported findings, neural network training provides better and more effective outcomes on numerous parameters.
APA, Harvard, Vancouver, ISO, and other styles
33

Desai, Bhoushika, and Roopesh Kevin Sungkur. "Software Quality Prediction Using Machine Learning." International Journal of Software Innovation 10, no. 1 (January 2022): 1–35. http://dx.doi.org/10.4018/ijsi.297997.

Full text
Abstract:
With the emergence of Machine Learning, many companies are increasingly embracing this revolutionary approach, both in terms of growth and maintenance, to reduce software costs. This research aimed at building two models: a Software Defect Prediction Model (SDPM), used to predict defects in software, and a Software Maintainability Prediction Model (SMPM), used for software maintainability prediction. Different classifiers, namely Random Forest, Decision Tree, Naïve Bayes, and Artificial Neural Networks, have been considered and then evaluated using different metrics such as Accuracy, Precision, Recall, and Area Under the Curve (AUC). The two models have been successfully evaluated, and Decision Tree, which tends to perform much better than the other classifiers, has been chosen. Finally, a framework based on a set of guidelines that can be used to improve software quality has been devised.
APA, Harvard, Vancouver, ISO, and other styles
34

Hong, Euy-Seok. "Taxonomy Framework for Metric-based Software Quality Prediction Models." Journal of the Korea Contents Association 10, no. 6 (June 28, 2010): 134–43. http://dx.doi.org/10.5392/jkca.2010.10.6.134.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

MURAKAMI, Yukasa, Masateru TSUNODA, and Koji TODA. "Evaluation of Software Fault Prediction Models Considering Faultless Cases." IEICE Transactions on Information and Systems E103.D, no. 6 (June 1, 2020): 1319–27. http://dx.doi.org/10.1587/transinf.2019kbp0019.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Chamoli, Shilpee, Gil Tenne, and Sanjay Bhatia. "Analysing Software Metrics for Accurate Dynamic Defect Prediction Models." Indian Journal of Science and Technology 8, S4 (February 1, 2015): 96. http://dx.doi.org/10.17485/ijst/2015/v8is4/63111.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Syeed, M. M. Mahbubul, Imed Hammouda, and Tarja Systä. "Prediction Models and Techniques for Open Source Software Projects." International Journal of Open Source Software and Processes 5, no. 2 (April 2014): 1–39. http://dx.doi.org/10.4018/ijossp.2014040101.

Full text
Abstract:
Open Source Software (OSS) is currently a widely adopted approach to developing and distributing software. For effective adoption of OSS, fundamental knowledge of project development is needed. This often calls for reliable prediction models to simulate project evolution and to envision a project's future. These models help support preventive maintenance and the building of quality software. This paper reports on a systematic literature survey aimed at identifying and structuring research that offers prediction models and techniques for analyzing OSS projects. In this review, we systematically selected and reviewed 52 peer-reviewed articles published between January 2000 and March 2013. The study outcome provides insight into what constitutes the main contributions of the field, identifies gaps and opportunities, and distills several important future research directions.
APA, Harvard, Vancouver, ISO, and other styles
38

Pandit, Mahesha, and Deepali Gupta. "Performance of Genetic Programming-based Software Defect Prediction Models." International Journal of Performability Engineering 17, no. 9 (2021): 787. http://dx.doi.org/10.23940/ijpe.21.09.p5.787795.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

LaMotte, Lynn R., and Jeffrey D. Wells. "Inverse prediction for heteroscedastic response using mixed models software." Communications in Statistics - Simulation and Computation 46, no. 6 (January 25, 2017): 4490–98. http://dx.doi.org/10.1080/03610918.2015.1118508.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Miyazaki, Y., A. Takanou, H. Nozaki, N. Nakagawa, and K. Okada. "Method to estimate parameter values in software prediction models." Information and Software Technology 33, no. 3 (April 1991): 239–43. http://dx.doi.org/10.1016/0950-5849(91)90139-3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

LaMotte, Lynn R., and Jeffrey D. Wells. "Inverse prediction for multivariate mixed models with standard software." Statistical Papers 57, no. 4 (August 4, 2016): 929–38. http://dx.doi.org/10.1007/s00362-016-0815-2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Balogun, Abdullateef Oluwagbemiga, Shuib Basri, Said Jadid Abdulkadir, and Ahmad Sobri Hashim. "Performance Analysis of Feature Selection Methods in Software Defect Prediction: A Search Method Approach." Applied Sciences 9, no. 13 (July 9, 2019): 2764. http://dx.doi.org/10.3390/app9132764.

Full text
Abstract:
Software Defect Prediction (SDP) models are built using software metrics derived from software systems. The quality of SDP models depends largely on the quality of the software metrics (dataset) used to build them. High dimensionality is one of the data quality problems that affect the performance of SDP models. Feature selection (FS) is a proven method for addressing the dimensionality problem. However, the choice of FS method for SDP is still a problem, as most empirical studies on FS methods for SDP produce contradictory and inconsistent outcomes. FS methods behave differently due to different underlying computational characteristics, which could stem from the choice of search method, since the impact of FS depends on the search method used. It is hence imperative to comparatively analyze the performance of FS methods based on different search methods in SDP. In this paper, four filter feature ranking (FFR) and fourteen filter feature subset selection (FSS) methods were evaluated using four different classifiers over five software defect datasets obtained from the National Aeronautics and Space Administration (NASA) repository. The experimental analysis showed that the application of FS improves the predictive performance of classifiers and that the performance of FS methods can vary across datasets and classifiers. Among the FFR methods, Information Gain demonstrated the greatest improvements in the performance of the prediction models. Among the FSS methods, Consistency Feature Subset Selection based on Best First Search had the best influence on the prediction models. However, prediction models based on FFR proved to be more stable than those based on FSS methods. Hence, we conclude that FS methods improve the performance of SDP models and that there is no single best FS method, as performance varied according to the dataset and the choice of prediction model. We therefore recommend the use of FFR methods, as prediction models based on FFR are more stable in terms of predictive performance.
APA, Harvard, Vancouver, ISO, and other styles
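Information Gain, the best-performing filter ranking method in the study above, scores a feature by how much knowing its value reduces the entropy of the class labels. A minimal sketch for discrete features follows; the helper names are hypothetical, not the study's code.

```python
from collections import Counter
from math import log2

def entropy(labels):
    # Shannon entropy of a label sequence, in bits.
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def info_gain(feature_values, labels):
    # IG = H(labels) - sum over feature values v of P(v) * H(labels | v).
    total = entropy(labels)
    n = len(labels)
    for v in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == v]
        total -= len(subset) / n * entropy(subset)
    return total

# A feature that perfectly separates the classes scores 1 bit;
# an uninformative feature scores 0.
ig_perfect = info_gain([0, 0, 1, 1], [0, 0, 1, 1])
ig_useless = info_gain([0, 1, 0, 1], [0, 0, 1, 1])
```

Ranking features by this score and keeping the top-k is the FFR scheme the paper evaluates; continuous software metrics would first be discretized.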
43

Mabayoje, Modinat Abolore, Abdullateef Olwagbemiga Balogun, Hajarah Afor Jibril, Jelili Olaniyi Atoyebi, Hammed Adeleye Mojeed, and Victor Elijah Adeyemo. "Parameter tuning in KNN for software defect prediction: an empirical analysis." Jurnal Teknologi dan Sistem Komputer 7, no. 4 (August 10, 2019): 121–26. http://dx.doi.org/10.14710/jtsiskom.7.4.2019.121-126.

Full text
Abstract:
Software Defect Prediction (SDP) provides insights that can help software teams allocate their limited resources when developing software systems. It predicts likely defective modules and helps avoid pitfalls associated with such modules. However, these insights may be inaccurate and unreliable if the parameters of SDP models are not taken into consideration. In this study, the effect of parameter tuning on the k-nearest neighbor (k-NN) algorithm in SDP was investigated: more specifically, the impact of varying and selecting the optimal k value, the influence of distance weighting, and the impact of distance functions on k-NN. An experiment was designed to investigate this problem in SDP over 6 software defect datasets. The experimental results revealed that the k value should be greater than 1 (the default), as the average RMSE value of k-NN when k > 1 (0.2727) is less than when k = 1, the default (0.3296). In addition, the predictive performance of k-NN with distance weighting improved by 8.82% and 1.7% based on AUC and accuracy, respectively. In terms of the distance function, k-NN models based on the Dilca distance function performed better than those based on the Euclidean distance function (the default). Hence, we conclude that parameter tuning has a positive effect on the predictive performance of k-NN in SDP.
APA, Harvard, Vancouver, ISO, and other styles
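The distance-weighting option the study above found beneficial replaces k-NN's plain majority vote with an inverse-distance-weighted vote, so closer neighbours count more. A 1-D illustrative sketch with assumed names, for binary 0/1 labels:

```python
def weighted_knn(train, query, k, eps=1e-9):
    # Inverse-distance weighted k-NN vote over (feature, label) pairs.
    # eps avoids division by zero when a training point coincides with query.
    neigh = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    scores = {0: 0.0, 1: 0.0}
    for x, label in neigh:
        scores[label] += 1.0 / (abs(x - query) + eps)
    return max(scores, key=scores.get)

# The single very close neighbour (label 1) outweighs the 2-vs-1 majority
# of label-0 points that plain, unweighted k-NN would follow.
train = [(1.0, 1), (4.0, 0), (5.0, 0)]
pred = weighted_knn(train, 0.0, k=3)
```

The study's Dilca distance applies to categorical attributes; the weighting mechanism shown here is independent of the distance function chosen.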
44

Goyal, Somya, and Pradeep Kumar Bhatia. "Comparison of Machine Learning Techniques for Software Quality Prediction." International Journal of Knowledge and Systems Science 11, no. 2 (April 2020): 20–40. http://dx.doi.org/10.4018/ijkss.2020040102.

Full text
Abstract:
Software quality prediction is one of the most challenging tasks in the development and maintenance of software. Machine learning (ML) is widely being incorporated for the prediction of the quality of a final product in the early development stages of the software development life cycle (SDLC). An ML prediction model uses software metrics and faulty data from previous projects to detect high-risk modules for future projects, so that testing efforts can be targeted to those specific 'risky' modules. Hence, ML-based predictors contribute to detecting development anomalies early and inexpensively and ensure the timely delivery of a successful, failure-free and supreme quality software product within budget. This article compares 30 software quality prediction models (5 techniques × 6 datasets) built on five ML techniques: artificial neural network (ANN); support vector machines (SVMs); decision trees (DTs); k-nearest neighbor (KNN); and Naïve Bayes classifiers (NBC), using six datasets: CM1, KC1, KC2, PC1, JM1, and a combined one. These models exploit the predictive power of static code metrics, the McCabe complexity metrics, for quality prediction. All thirty predictors are compared using the receiver operating characteristic (ROC) curve, area under the curve (AUC), and accuracy as performance evaluation criteria. The results show that the ANN technique for software quality prediction is promising for accurate quality prediction irrespective of the dataset used.
APA, Harvard, Vancouver, ISO, and other styles
45

Li, Yanjun, Huan Huang, Qiang Geng, Xinwei Guo, and Yuyu Yuan. "Fairness Measures of Machine Learning Models in Judicial Penalty Prediction." Journal of Internet Technology 23, no. 5 (September 2022): 1109–16. http://dx.doi.org/10.53106/160792642022092305019.

Full text
Abstract:
Machine learning (ML) has been widely adopted in many software applications across domains. However, accompanying the outstanding performance, the behaviors of ML models, which are essentially a kind of black-box software, can be unfair and hard to understand in many cases. In our human-centered society, an unfair decision could potentially damage human values and even cause severe social consequences, especially in decision-critical scenarios such as legal judgment. Although some existing works have investigated ML models in terms of robustness, accuracy, security, privacy, quality, etc., the study of the fairness of ML is still in an early stage. In this paper, we first propose a set of fairness metrics for ML models from different perspectives. Based on this, we perform a comparative study on the fairness of existing, widely used classic ML and deep learning models in the domain of real-world judicial judgments. The experimental results reveal that current state-of-the-art ML models can still raise concerns about unfair decision-making. ML models with both high accuracy and fairness are urgently in demand.
APA, Harvard, Vancouver, ISO, and other styles
46

Ali, Awad, Mohammed Bakri Bashir, Alzubair Hassan, Rafik Hamza, Samar M. Alqhtani, Tawfeeg Mohmmed Tawfeeg, and Adil Yousif. "Design-Time Reliability Prediction Model for Component-Based Software Systems." Sensors 22, no. 7 (April 6, 2022): 2812. http://dx.doi.org/10.3390/s22072812.

Full text
Abstract:
Software reliability is prioritised as the most critical quality attribute. Reliability prediction models help prevent software failures, which can cause critical events and disastrous consequences in safety-critical applications or even in businesses. Predicting reliability during design allows software developers to avoid potential design problems, which can otherwise result in reconstructing an entire system when discovered at later stages of the software development life-cycle. Several reliability models have been built to predict reliability during software development. However, several issues still exist in these models. Current models suffer from a scalability issue, referred to as the modeling of large systems, and existing scalability solutions usually come at a high computational cost. Secondly, accounting for the nature of concurrent applications in reliability prediction is another issue. We propose a reliability prediction model that enhances scalability by introducing a system-level scenario synthesis mechanism that mitigates complexity. Additionally, the proposed model supports modeling the nature of concurrent applications through the adaptation of formal statistical distributions for scenario combination. The proposed model was evaluated using sensor-based case studies. The experimental results show the effectiveness of the proposed model in terms of computational cost reduction compared to similar models; this reduction is the main parameter for scalability enhancement. In addition, the presented work enables system developers to know up to which load their system will remain reliable, via observation of the reliability value in several running scenarios.
APA, Harvard, Vancouver, ISO, and other styles
47

Almayyan, Waheeda. "Towards Predicting Software Defects with Clustering Techniques." International Journal of Artificial Intelligence & Applications 12, no. 1 (January 31, 2021): 39–54. http://dx.doi.org/10.5121/ijaia.2021.12103.

Full text
Abstract:
The purpose of software defect prediction is to improve the quality of a software project by building a predictive model that decides whether a software module is or is not fault prone. In recent years, much research using machine learning techniques on this topic has been performed. Our aim was to evaluate the performance of clustering techniques with feature selection schemes to address the software defect prediction problem. We analysed the National Aeronautics and Space Administration (NASA) dataset benchmarks using three clustering algorithms: (1) Farthest First, (2) X-Means, and (3) self-organizing map (SOM). In order to evaluate different feature selection algorithms, this article presents a comparative analysis of software defect prediction based on Bat, Cuckoo, Grey Wolf Optimizer (GWO), and particle swarm optimizer (PSO). The results obtained with the proposed clustering models enabled us to build an efficient predictive model with a satisfactory detection rate and an acceptable number of features.
APA, Harvard, Vancouver, ISO, and other styles
48

Yuan, Yuyu, Chenlong Li, and Jincui Yang. "An Improved Confounding Effect Model for Software Defect Prediction." Applied Sciences 13, no. 6 (March 8, 2023): 3459. http://dx.doi.org/10.3390/app13063459.

Full text
Abstract:
Software defect prediction technology can effectively improve software quality. Based on code metrics, machine learning models are built to predict potential defects. Some researchers have indicated that the size metric could cause confounding effects and bias the prediction results. However, evidence shows that the real confounders are the development cycle and the number of developers, which could introduce confounding effects when code metrics are used for prediction. This paper proposes an improved confounding effect model, introducing a new confounding variable into the traditional model. On multiple projects, we experimentally analyzed the extent of the confounding variable's effect. Furthermore, we verified that controlling confounding variables helps improve the predictive model's performance.
APA, Harvard, Vancouver, ISO, and other styles
49

Kakkar, Misha, Sarika Jain, Abhay Bansal, and P. S. Grover. "Nonlinear Geometric Framework for Software Defect Prediction." International Journal of Decision Support System Technology 12, no. 3 (July 2020): 85–100. http://dx.doi.org/10.4018/ijdsst.2020070105.

Full text
Abstract:
Humans use software in every walk of life, so it is essential to have the best quality software. Software defect prediction models assist in identifying defect-prone modules with the help of historical data, which in turn improves software quality. Historical data consist of data related to modules/files/classes that are labeled as buggy or clean. As the number of buggy artifacts is small compared to clean artifacts, the historical data are imbalanced. Due to this uneven distribution of the data, it is difficult for classification algorithms to build highly effective SDP models. The objective of this study is to propose a new nonlinear geometric framework based on SMOTE and ensemble learning to improve the performance of SDP models. The study combines the traditional SMOTE algorithm with a novel ensemble Support Vector Machine (SVM) to develop the proposed framework, called SMEnsemble. The SMOTE algorithm handles the class imbalance problem by generating synthetic instances of the minority class. Ensemble learning generates multiple classification models to select the best performing SDP model. For experimentation, datasets from three different software repositories that contain both open source and proprietary projects are used in the study. The results show that SMEnsemble performs better than traditional methods at identifying the minority class, i.e., buggy artifacts. The proposed model's performance is also better than that of the latest state-of-the-art SDP model, SMOTUNED. The proposed model is capable of handling imbalanced classes when compared with traditional methods. Also, by carefully selecting the number of ensembles, high performance can be achieved in less time.
APA, Harvard, Vancouver, ISO, and other styles
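The SMOTE step this framework builds on creates synthetic minority-class samples by interpolating between a minority point and one of its k nearest minority neighbours. A minimal pure-Python sketch, not the authors' implementation; the `smote` name and toy data are assumptions:

```python
import random

def smote(minority, n_new, k, seed=0):
    # Generate n_new synthetic minority samples. Each one lies on the line
    # segment between a random minority point and one of its k nearest
    # minority-class neighbours (Euclidean distance over feature tuples).
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neigh = sorted((p for p in minority if p is not base),
                       key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))[:k]
        other = rng.choice(neigh)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Three buggy-class points; generate five synthetic ones between them.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_pts = smote(minority, n_new=5, k=2)
```

Because each synthetic point is a convex combination of two real minority points, it stays inside the region the minority class already occupies, which is what distinguishes SMOTE from naive duplication.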
50

Pan, Cong, Minyan Lu, and Biao Xu. "An Empirical Study on Software Defect Prediction Using CodeBERT Model." Applied Sciences 11, no. 11 (May 23, 2021): 4793. http://dx.doi.org/10.3390/app11114793.

Full text
Abstract:
Deep learning-based software defect prediction has been popular these days. Recently, the publishing of the CodeBERT model has made it possible to perform many software engineering tasks. We propose various CodeBERT models targeting software defect prediction, including CodeBERT-NT, CodeBERT-PS, CodeBERT-PK, and CodeBERT-PT. We perform empirical studies using such models in cross-version and cross-project software defect prediction to investigate if using a neural language model like CodeBERT could improve prediction performance. We also investigate the effects of different prediction patterns in software defect prediction using CodeBERT models. The empirical results are further discussed.
APA, Harvard, Vancouver, ISO, and other styles