Selection of scholarly literature on the topic "Error log clustering"

Browse the lists of current articles, books, dissertations, reports, and other scholarly sources on the topic "Error log clustering".

Next to every work in the bibliography, the option "Add to bibliography" is available. Use it, and the bibliographic reference for the chosen work will be formatted automatically in the required citation style (APA, MLA, Harvard, Chicago, Vancouver, etc.).

You can also download the full text of the scholarly publication as a PDF and read the online abstract of the work, if the relevant parameters are present in its metadata.

Journal articles on the topic "Error log clustering"

1. Yelamanchili, Rama Krishna. "Modeling Stock Market Monthly Returns Volatility Using GARCH Models Under Different Distributions". International Journal of Accounting & Finance Review 5, no. 1 (March 18, 2020): 42–50. http://dx.doi.org/10.46281/ijafr.v5i1.425.

Abstract:
This paper aims to uncover stylized facts of monthly stock market returns and identify an adequate GARCH model with an appropriate distribution density that captures conditional variance in monthly stock market returns. We obtain monthly close values of the Bombay Stock Exchange's (BSE) Sensex over the period January 1991 to December 2019 (348 monthly observations). To model the conditional variance, volatility clustering, asymmetry, and leverage effect, we apply four conventional GARCH models under three different distribution densities. We use two information criteria to choose the best-fit model. Results reveal positive skewness, weak excess kurtosis, and no autocorrelation in relative returns and log returns. On the other side, the presence of autocorrelation in squared log returns indicates volatility clustering. All four GARCH models have better information criterion values under the Gaussian distribution compared to the t-distribution and the Generalized Error Distribution. Furthermore, results indicate that the conventional GARCH model is adequate to measure the conditional volatility. The GJR-GARCH model under the Gaussian distribution exhibits a leverage effect, but it is not statistically significant at any standard significance level. Other asymmetric models do not exhibit a leverage effect. Among the 12 models fitted in the present paper, the GARCH model has the best information criterion values, log-likelihood value, and the lowest standard errors for all coefficients in the model.
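
For readers who want to reproduce this kind of comparison, the sketch below fits GARCH-family models under several error distributions and compares them by information criteria. It is a minimal illustration using the Python arch package, not the author's code; the input file, the 'Close' column name, and the use of monthly percentage log returns are assumptions.

```python
import numpy as np
import pandas as pd
from arch import arch_model

# Hypothetical input: a CSV with a 'Close' column of monthly index values.
prices = pd.read_csv("sensex_monthly.csv", index_col=0, parse_dates=True)
returns = 100 * np.log(prices["Close"]).diff().dropna()   # monthly log returns in percent

results = {}
for dist in ("normal", "t", "ged"):                        # three distribution densities
    for label, kwargs in {
        "GARCH":     dict(vol="GARCH", p=1, q=1),
        "GJR-GARCH": dict(vol="GARCH", p=1, o=1, q=1),     # o=1 adds the asymmetry term
        "EGARCH":    dict(vol="EGARCH", p=1, q=1),
    }.items():
        res = arch_model(returns, mean="Constant", dist=dist, **kwargs).fit(disp="off")
        results[(label, dist)] = (res.aic, res.bic, res.loglikelihood)

# Rank candidate models by AIC, mirroring the information-criterion comparison in the paper.
for key, (aic, bic, llf) in sorted(results.items(), key=lambda kv: kv[1][0]):
    print(key, f"AIC={aic:.1f}  BIC={bic:.1f}  logL={llf:.1f}")
```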

2. Grigorieva, Maria, and Dmitry Grin. "Clustering error messages produced by distributed computing infrastructure during the processing of high energy physics data". International Journal of Modern Physics A 36, no. 10 (April 10, 2021): 2150070. http://dx.doi.org/10.1142/s0217751x21500706.

Abstract:
Large-scale distributed computing infrastructures ensure the operation and maintenance of scientific experiments at the LHC: more than 160 computing centers all over the world execute tens of millions of computing jobs per day. ATLAS — the largest experiment at the LHC — creates an enormous flow of data which has to be recorded and analyzed by a complex heterogeneous and distributed computing environment. Statistically, about 10–12% of computing jobs end with a failure: network faults, service failures, authorization failures, and other error conditions trigger error messages which provide detailed information about the issue, which can be used for diagnosis and proactive fault handling. However, this analysis is complicated by the sheer scale of textual log data, and often exacerbated by the lack of a well-defined structure: human experts have to interpret the detected messages and create parsing rules manually, which is time-consuming and does not allow identifying previously unknown error conditions without further human intervention. This paper is dedicated to the description of a pipeline of methods for the unsupervised clustering of multi-source error messages. The pipeline is data-driven, based on machine learning algorithms, and executed fully automatically, allowing categorizing error messages according to textual patterns and meaning.
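
As a rough illustration of the kind of unsupervised pipeline described here (not the authors' implementation), the sketch below vectorizes raw error messages with TF-IDF after masking volatile tokens and groups them with DBSCAN; the input file, masking regexes, and DBSCAN parameters are assumptions.

```python
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import DBSCAN

def normalize(msg: str) -> str:
    """Mask volatile substrings so messages differing only in IDs/paths/numbers look alike."""
    msg = re.sub(r"/[\w./-]+", "<PATH>", msg)        # file paths
    msg = re.sub(r"\b0x[0-9a-fA-F]+\b", "<HEX>", msg)
    msg = re.sub(r"\d+", "<NUM>", msg)               # job ids, ports, counters, ...
    return msg.lower()

# Hypothetical input: one error message per line.
with open("error_messages.txt") as fh:
    messages = [line.strip() for line in fh if line.strip()]

X = TfidfVectorizer(preprocessor=normalize, token_pattern=r"[a-z<>_]{2,}").fit_transform(messages)

# Cosine distance suits sparse TF-IDF vectors; eps is data dependent.
labels = DBSCAN(eps=0.3, min_samples=5, metric="cosine").fit_predict(X)

for cluster_id in sorted(set(labels)):
    members = [m for m, lab in zip(messages, labels) if lab == cluster_id]
    name = "noise/unmatched" if cluster_id == -1 else f"cluster {cluster_id}"
    print(f"{name}: {len(members)} messages, e.g. {members[0][:80]!r}")
```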

3. He, Ruiquan, Haihua Hu, Chunru Xiong, and Guojun Han. "Artificial Neural Network Assisted Error Correction for MLC NAND Flash Memory". Micromachines 12, no. 8 (July 27, 2021): 879. http://dx.doi.org/10.3390/mi12080879.

Abstract:
The multilevel per cell technology and continued scaling down of process technology significantly improve the storage density of NAND flash memory, but they also bring about a challenge in that data reliability degrades due to severe noise. To ensure data reliability, many noise mitigation technologies have been proposed. However, each of them mitigates only one of the noise sources of the NAND flash memory channel. In this paper, we consider all the main noise sources and present a novel artificial neural network-assisted error correction (ANNAEC) scheme to increase the reliability of multi-level cell (MLC) NAND flash memory. To avoid using retention time as an input parameter of the neural network, we propose a relative log-likelihood ratio (LLR) to estimate the actual LLR. Then, we transform bit detection into a clustering problem and propose to employ a neural network to learn the error characteristics of the NAND flash memory channel. Therefore, the trained neural network has optimized performance for bit error detection. Simulation results show that our proposed scheme can significantly improve the performance of bit error detection and increase the endurance of NAND flash memory.

4. Xie, Shu-tong, Qiong Chen, Kun-hong Liu, Qing-zhao Kong, and Xiu-juan Cao. "Learning Behavior Analysis Using Clustering and Evolutionary Error Correcting Output Code Algorithms in Small Private Online Courses". Scientific Programming 2021 (June 14, 2021): 1–11. http://dx.doi.org/10.1155/2021/9977977.

Abstract:
In recent years, online and offline teaching activities have been combined by the Small Private Online Course (SPOC) teaching activities, which can achieve a better teaching result. Therefore, colleges around the world have widely carried out SPOC-based blending teaching. Particularly in this year’s epidemic, the online education platform has accumulated lots of education data. In this paper, we collected the student behavior log data during the blending teaching process of the “College Information Technology Fundamentals” course of three colleges to conduct student learning behavior analysis and learning outcome prediction. Firstly, data collection and preprocessing are carried out; cluster analysis is performed by using k-means algorithms. Four typical learning behavior patterns have been obtained from previous research, and these patterns were analyzed in terms of teaching videos, quizzes, and platform visits. Secondly, a multiclass classification framework, which combines a feature selection method based on genetic algorithm (GA) with the error correcting output code (ECOC) method, is designed for training the classification model to achieve the prediction of grade levels of students. The experimental results show that the multiclass classification method proposed in this paper can effectively predict the grade of performance, with an average accuracy rate of over 75%. The research results help to implement personalized teaching for students with different grades and learning patterns.
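
A compact sketch of the two stages described above, using scikit-learn; a univariate selector stands in for the authors' genetic-algorithm feature selection, and the input file and column names are hypothetical.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OutputCodeClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical export of per-student behavior features (video time, quiz attempts, visits, ...).
df = pd.read_csv("spoc_behavior_features.csv")
X = df.drop(columns=["grade_level"]).values
y = df["grade_level"].values

# Stage 1: cluster students into typical learning-behavior patterns.
patterns = KMeans(n_clusters=4, random_state=0).fit_predict(StandardScaler().fit_transform(X))

# Stage 2: multiclass grade prediction with an error-correcting output code wrapper.
# SelectKBest replaces the paper's GA-based feature selection in this sketch.
clf = make_pipeline(
    StandardScaler(),
    SelectKBest(f_classif, k=10),
    OutputCodeClassifier(LogisticRegression(max_iter=1000), code_size=2.0, random_state=0),
)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```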

5. Lee, Jinhyung. "Factors Affecting Health Information Technology Expenditure in California Hospitals". International Journal of Healthcare Information Systems and Informatics 10, no. 2 (April 2015): 1–13. http://dx.doi.org/10.4018/ijhisi.2015040101.

Abstract:
This paper investigates the factors affecting health information technology (IT) investment. Different from previous studies, health IT was measured as the dollar amount of hardware, software, and labor related to health IT. This study employed hospital- and patient-level data of the Office of Statewide Health Planning and Development (OSHPD) from 2000 to 2006. The generalized linear model (GLM) was employed with a log link and normal distribution and controlled for clustering error. This study found that not-for-profit and government hospitals, teaching hospitals, competition, and the health IT expenditure of neighboring hospitals were positively associated with health IT expenditure. However, rural hospitals were negatively associated with health IT expenditure. Moreover, this study found a significant increase in health IT investment over seven years, resulting from increased clinical IT adoption.
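
A minimal statsmodels sketch of this type of specification (Gaussian GLM with a log link and standard errors clustered by hospital); the file, column names, and formula are assumptions, not the study's actual variables.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical hospital-year panel with an IT expenditure column and a hospital identifier.
df = pd.read_csv("oshpd_hospital_panel.csv")

model = smf.glm(
    "it_expenditure ~ not_for_profit + government + teaching + competition_index"
    " + neighbor_it_expenditure + rural + C(year)",
    data=df,
    family=sm.families.Gaussian(link=sm.families.links.Log()),  # normal distribution, log link
)
# Cluster the error terms at the hospital level.
result = model.fit(cov_type="cluster", cov_kwds={"groups": df["hospital_id"]})
print(result.summary())
```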

6. Sharma, Abhishek, and Tarun Gulati. "Change Detection from Remotely Sensed Images Based on Stationary Wavelet Transform". International Journal of Electrical and Computer Engineering (IJECE) 7, no. 6 (December 1, 2017): 3395. http://dx.doi.org/10.11591/ijece.v7i6.pp3395-3401.

Abstract:
The major issue of concern in the change detection process is the accuracy of the algorithm in recovering changed and unchanged pixels. The fusion rules presented in the existing methods could not integrate the features accurately, which results in a larger number of false alarms and speckle noise in the output image. This paper proposes an algorithm which fuses two multi-temporal images through a proposed set of fusion rules in the stationary wavelet transform. In the first step, the source images obtained from the log-ratio and mean-ratio operators are decomposed into three high-frequency sub-bands and one low-frequency sub-band by the stationary wavelet transform. Then, the proposed fusion rules for the low- and high-frequency sub-bands are applied on the coefficient maps to get the fused wavelet coefficient map. The fused image is recovered by applying the inverse stationary wavelet transform (ISWT) on the fused coefficient map. Finally, the changed and unchanged areas are classified using Fuzzy c-means clustering. The performance of the algorithm is evaluated in terms of percentage correct classification (PCC), overall error (OE), and Kappa coefficient (Kc). The qualitative and quantitative results show that the proposed method offers the lowest error and the highest accuracy and Kappa value compared to existing methods.

7. Ferro, A., G. Pigola, A. Pulvirenti, and D. Shasha. "Fast Clustering and Minimum Weight Matching Algorithms for Very Large Mobile Backbone Wireless Networks". International Journal of Foundations of Computer Science 14, no. 02 (April 2003): 223–36. http://dx.doi.org/10.1142/s0129054103001698.

Abstract:
Mobile Backbone Wireless Networks (MBWN) [10] are wireless networks in which the base stations are mobile. Our strategy is the following: mobile nodes are dynamically grouped into clusters of bounded radius. In the very large wireless networks we deal with, several hundreds of clusters may be generated. Clustering makes use of a two-dimensional Euclidean version of the Antipole Tree data structure [5]. This very effective structure was originally designed for finite sets of points in an arbitrary metric space to support efficient range searching. It requires only a linear number of pair-wise distance calculations among nodes. Mobile base stations occupy an approximate centroid of the clusters and are moved according to a fast practical bipartite matching algorithm which tries to minimize both total and maximum distance. We show that the best known computational geometry algorithms [1] become infeasible for our application when a high number of mobile base stations is required. On the other hand, our proposed solution, with an 8% average error, requires O(k log k) running time instead of the approximately O(k^2) exact algorithm [1]. Communication among nodes is realized by a Clusterhead Gateway Switching Routing (CGSR) protocol [15] where the mobile base stations are organized in a suitable network. Other efficient clustering algorithms [11, 17] may be used instead of the Antipole Tree. However, the nice hierarchical structure of the Antipole Tree makes it applicable to other types of mobile wireless (ad hoc) and wired networks, but this will be the subject of future work.
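
To make the two-step idea concrete (clusters of bounded size, then a minimum-weight assignment of base stations to cluster centroids), here is a small illustrative sketch; it uses k-means and the exact Hungarian algorithm from SciPy rather than the Antipole Tree and the paper's O(k log k) matching heuristic, and all coordinates are synthetic.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
nodes = rng.uniform(0, 1000, size=(5000, 2))        # mobile node positions (synthetic)
k = 40                                              # number of mobile base stations

# Step 1: group nodes into k clusters; centroids are the target base-station sites.
centroids = KMeans(n_clusters=k, n_init=10, random_state=0).fit(nodes).cluster_centers_

# Step 2: assign current base stations to centroids, minimizing total relocation distance.
base_stations = rng.uniform(0, 1000, size=(k, 2))   # current base-station positions (synthetic)
cost = cdist(base_stations, centroids)              # pairwise Euclidean distances
rows, cols = linear_sum_assignment(cost)            # exact minimum-weight bipartite matching
print("total relocation distance:", cost[rows, cols].sum())
print("maximum single relocation:", cost[rows, cols].max())
```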

8. Yu, Yanxiang, Chicheng Xu, Siddharth Misra, Weichang Li, Michael Ashby, Wen Pan, Tianqi Deng, et al. "Synthetic Sonic Log Generation With Machine Learning: A Contest Summary From Five Methods". Petrophysics – The SPWLA Journal of Formation Evaluation and Reservoir Description 62, no. 4 (August 1, 2021): 393–406. http://dx.doi.org/10.30632/pjv62n4-2021a4.

Abstract:
Compressional and shear sonic traveltime logs (DTC and DTS, respectively) are crucial for subsurface characterization and seismic-well tie. However, these two logs are often missing or incomplete in many oil and gas wells. Therefore, many petrophysical and geophysical workflows include sonic log synthetization or pseudo-log generation based on multivariate regression or rock physics relations. Started on March 1, 2020, and concluded on May 7, 2020, the SPWLA PDDA SIG hosted a contest aiming to predict the DTC and DTS logs from seven “easy-to-acquire” conventional logs using machine-learning methods (GitHub, 2020). In the contest, a total number of 20,525 data points with half-foot resolution from three wells was collected to train regression models using machine-learning techniques. Each data point had seven features, consisting of the conventional “easy-to-acquire” logs: caliper, neutron porosity, gamma ray (GR), deep resistivity, medium resistivity, photoelectric factor, and bulk density, respectively, as well as two sonic logs (DTC and DTS) as the target. The separate data set of 11,089 samples from a fourth well was then used as the blind test data set. The prediction performance of the model was evaluated using root mean square error (RMSE) as the metric, shown in the equation below:

RMSE = sqrt( (1/(2m)) * Σ_{i=1}^{m} [ (DTC_pred^i - DTC_true^i)^2 + (DTS_pred^i - DTS_true^i)^2 ] )

In the benchmark model, (Yu et al., 2020), we used a Random Forest regressor and conducted minimal preprocessing to the training data set; an RMSE score of 17.93 was achieved on the test data set. The top five models from the contest, on average, beat the performance of our benchmark model by 27% in the RMSE score. In the paper, we will review these five solutions, including preprocess techniques and different machine-learning models, including neural network, long short-term memory (LSTM), and ensemble trees. We found that data cleaning and clustering were critical for improving the performance in all models.
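
The contest metric is straightforward to evaluate; a small sketch (array names and values assumed) follows.

```python
import numpy as np

def contest_rmse(dtc_pred, dtc_true, dts_pred, dts_true):
    """Joint RMSE over the DTC and DTS predictions, following the contest metric above."""
    dtc_pred, dtc_true = np.asarray(dtc_pred), np.asarray(dtc_true)
    dts_pred, dts_true = np.asarray(dts_pred), np.asarray(dts_true)
    m = len(dtc_true)
    sq_err = (dtc_pred - dtc_true) ** 2 + (dts_pred - dts_true) ** 2
    return np.sqrt(sq_err.sum() / (2 * m))

# Tiny usage example with made-up traveltime values:
print(contest_rmse([60.0, 65.0], [61.0, 64.0], [100.0, 110.0], [98.0, 112.0]))
```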

9. Carabin, H., S. T. McGarvey, I. Sahlu, M. R. Tarafder, L. Joseph, B. B. de Andrade, E. Balolong, and R. Olveda. "Schistosoma japonicum in Samar, the Philippines: infection in dogs and rats as a possible risk factor for human infection". Epidemiology and Infection 143, no. 8 (October 2, 2014): 1767–76. http://dx.doi.org/10.1017/s0950268814002581.

Abstract:
The role that animals play in the transmission of Schistosoma japonicum to humans in the Philippines remains uncertain, and prior studies have not included several species, adjustment for misclassification error and clustering, or used a cohort design. A cohort study of 2468 people providing stool samples at 12 months following praziquantel treatment in 50 villages of Western Samar, the Philippines, was conducted. Stool samples from dogs, cats, rats, and water buffaloes were collected at baseline (2003–2004) and follow-up (2005). Latent-class hierarchical Bayesian log-binomial models adjusting for misclassification errors in diagnostic tests were used. The village-level baseline and follow-up prevalences of cat, dog, and rat S. japonicum infection were associated with the 12-month cumulative incidence of human S. japonicum infection, with similar magnitude and precision of effect, but correlation between infection levels made it difficult to separate their respective effects. The cumulative incidence ratios associated with a 1% increase in the prevalence of infection in dogs at baseline and in rats at follow-up were 1·04 [95% Bayesian credible interval (BCI) 1·02–1·07] and 1·02 (95% BCI 1·01–1·04), respectively, when both species were entered in the model. Dogs appear to play a role in human schistosomiasis infection, while rats could be used as schistosomiasis sentinels.

10. Wei, Chunzhu, Qianying Zhao, Yang Lu, and Dongjie Fu. "Assessment of Empirical Algorithms for Shallow Water Bathymetry Using Multi-Spectral Imagery of Pearl River Delta Coast, China". Remote Sensing 13, no. 16 (August 6, 2021): 3123. http://dx.doi.org/10.3390/rs13163123.

Abstract:
Pearl River Delta (PRD), as one of the most densely populated regions in the world, is facing both natural changes (e.g., sea level rise) and human-induced changes (e.g., dredging for navigation and land reclamation). Bathymetric information is thus important for the protection and management of the estuarine environment, but little effort has been made to comprehensively evaluate the performance of different methods and datasets. In this study, two linear regression models—the linear band model and the log-transformed band ratio model, and two non-linear regression models—the support vector regression model and the random forest regression model—were applied to Landsat 8 (L8) and Sentinel-2 (S2) imagery for bathymetry mapping in 2019 and 2020. Results suggested that a priori area clustering based on spectral features using the K-means algorithm improved estimation accuracy. The random forest regression model performed best, and the three-band combinations outperformed two-band combinations in all models. When the non-linear models were applied with three-band combination (red, green, blue) to L8 and S2 imagery, the Root Mean Square Error (Mean Absolute Error) decreased by 23.10% (35.53%), and the coefficient of determination (Kling-Gupta efficiency) increased by 0.08 (0.09) on average, compared to those using the linear regression models. Despite the differences in spatial resolution and band wavelength, L8 and S2 performed similarly in bathymetry estimation. This study quantified the relative performance of different models and may shed light on the potential combination of multiple data sources for more timely and accurate bathymetry mapping.
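
As an illustration of the best-performing setup described above (a priori K-means clustering of the spectral features followed by a per-cluster random forest on the three-band combination), here is a hedged scikit-learn sketch; the CSV file, column names, and cluster count are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Hypothetical table of co-located satellite reflectances and sonar depths.
df = pd.read_csv("prd_bathymetry_samples.csv")      # columns: red, green, blue, depth
X, y = df[["red", "green", "blue"]].values, df["depth"].values

# A priori area clustering on the spectral features, as in the paper's preprocessing step.
area = KMeans(n_clusters=3, random_state=0).fit_predict(X)

X_tr, X_te, y_tr, y_te, a_tr, a_te = train_test_split(X, y, area, test_size=0.3, random_state=0)

# One random forest per spectral cluster.
models = {c: RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr[a_tr == c], y_tr[a_tr == c])
          for c in np.unique(a_tr)}
pred = np.concatenate([models[c].predict(X_te[a_te == c]) for c in np.unique(a_te)])
truth = np.concatenate([y_te[a_te == c] for c in np.unique(a_te)])

print("RMSE:", np.sqrt(((truth - pred) ** 2).mean()))
print("MAE: ", mean_absolute_error(truth, pred))
```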

Dissertations and theses on the topic "Error log clustering"

1. Bjurenfalk, Jonatan, and August Johnson. "Automated error matching system using machine learning and data clustering: Evaluating unsupervised learning methods for categorizing error types, capturing bugs, and detecting outliers". Thesis, Linköpings universitet, Programvara och system, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177280.

Abstract:
For large and complex software systems, it is a time-consuming process to manually inspect the error logs produced by the test suites of such systems. Whether it is for identifying abnormal faults or finding bugs, it is a process that limits development progress and requires experience. An automated solution for such processes could potentially lead to efficient fault identification and bug reporting, while also enabling developers to spend more time on improving system functionality. Three unsupervised clustering algorithms are evaluated for the task: HDBSCAN, DBSCAN, and X-Means. In addition, HDBSCAN, DBSCAN, and an LSTM-based autoencoder are evaluated for outlier detection. The dataset consists of error logs produced by a robotic test system. These logs are cleaned and pre-processed using stopword removal, stemming, term frequency-inverse document frequency (tf-idf), and singular value decomposition (SVD). Two domain experts were tasked with evaluating the results produced from clustering and outlier detection. Results indicate that X-Means outperforms the other clustering algorithms when tasked with automatically categorizing error types and capturing bugs. Furthermore, none of the outlier detection methods yielded sufficient results. However, it was found that X-Means clusters with a size of one data point yielded an accurate representation of outliers occurring in the error log dataset. In conclusion, the domain experts deemed X-Means to be a helpful tool for categorizing error types, capturing bugs, and detecting outliers.
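
A condensed sketch of this preprocessing-plus-clustering pipeline (TF-IDF, SVD, clustering, singleton clusters flagged as outliers) using scikit-learn; k-means with a fixed k stands in for X-Means, stemming is omitted, and the input file is hypothetical.

```python
from collections import Counter

from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical input: one cleaned error-log entry per line.
with open("test_suite_errors.txt") as fh:
    logs = [line.strip() for line in fh if line.strip()]

# Stopword removal and tf-idf, then SVD down to a dense low-dimensional representation.
features = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    TruncatedSVD(n_components=50, random_state=0),
).fit_transform(logs)

# k-means as a stand-in for X-Means; X-Means would choose the number of clusters itself.
labels = KMeans(n_clusters=20, random_state=0).fit_predict(features)

# Clusters of size one are treated as outliers, mirroring the thesis' observation.
sizes = Counter(labels)
outliers = [log for log, lab in zip(logs, labels) if sizes[lab] == 1]
print(f"{len(set(labels))} clusters, {len(outliers)} singleton outliers")
```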

2. Joseph, Binoy. "Clustering For Designing Error Correcting Codes". Thesis, Indian Institute of Science, 1994. http://hdl.handle.net/2005/66.

Abstract:
In this thesis we address the problem of designing codes for specific applications. To do so we make use of the relationship between clusters and codes. Designing a block code over any finite dimensional space may be thought of as forming the corresponding number of clusters over the particular dimensional space. In the literature a number of algorithms are available for clustering. We have examined the performance of a number of such algorithms, such as Linde-Buzo-Gray, Simulated Annealing, Simulated Annealing with Linde-Buzo-Gray, Deterministic Annealing, etc., for the design of codes. But all these algorithms make use of the Euclidean squared error distance measure for clustering. This distance measure does not match the distance measure of interest in the error correcting scenario, namely, Hamming distance. Consequently we have developed an algorithm that can be used for clustering with Hamming distance as the distance measure. Also, it has been observed that stochastic algorithms, such as Simulated Annealing, fail to produce optimum codes due to very slow convergence near the end. As a remedy, we have proposed a modification based on the code structure for such algorithms for code design, which makes it possible to converge to the optimum codes.
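
To illustrate clustering under the Hamming metric (a simple stand-in, not the thesis' algorithm), here is a small k-means-style sketch over random binary words, where each cluster representative is the bitwise majority vote, which minimizes total Hamming distance within the cluster.

```python
import numpy as np

def hamming_cluster(words, k, n_iter=50, seed=0):
    """Cluster binary words with Hamming distance; representatives double as candidate codewords."""
    rng = np.random.default_rng(seed)
    reps = words[rng.choice(len(words), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Hamming distance from every word to every representative.
        dist = (words[:, None, :] != reps[None, :, :]).sum(axis=2)
        assign = dist.argmin(axis=1)
        new_reps = reps.copy()
        for c in range(k):
            members = words[assign == c]
            if len(members):
                new_reps[c] = (members.mean(axis=0) >= 0.5).astype(int)  # bitwise majority
        if np.array_equal(new_reps, reps):
            break
        reps = new_reps
    return reps, assign

rng = np.random.default_rng(1)
words = rng.integers(0, 2, size=(1024, 16))          # random 16-bit words
codebook, labels = hamming_cluster(words, k=8)
print(codebook)                                       # 8 candidate codewords
```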

Book chapters on the topic "Error log clustering"

1. Shang, Yingying. "LAR: A User Behavior Prediction Model in Server Log Based on LSTM-Attention Network and RSC Algorithm". In Fuzzy Systems and Data Mining VI. IOS Press, 2020. http://dx.doi.org/10.3233/faia200709.

Abstract:
Using server log data to predict the URLs that a user is likely to visit is an important research area in user behavior prediction. In this paper, a predictive model (called LAR) based on the long short-term memory (LSTM) attention network and the reciprocal-nearest-neighbors supported clustering algorithm (RSC) is proposed for predicting the URL. First, the LSTM-attention network is used to predict the URL categories a user might visit, and the RSC algorithm is then used to cluster users. Subsequently, the URLs belonging to the same category are determined from the user clusters to predict the URLs that the user might visit. The proposed LAR model considers the time sequence of the user access URLs and the relationship between a single user and group users, which effectively improves the prediction accuracy. The experimental results demonstrate that the LAR model is feasible and effective for user behavior prediction. The mean absolute error and root mean square error of the LAR model are better than those of the other models compared in this study.
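
For orientation, a minimal PyTorch sketch of an LSTM-with-attention classifier over sequences of URL-category IDs (not the LAR architecture itself; the dimensions and the additive-attention form are assumptions):

```python
import torch
import torch.nn as nn

class LstmAttentionNet(nn.Module):
    """Predicts the next URL category from a user's recent URL-category sequence."""

    def __init__(self, n_categories: int, embed_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(n_categories, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.attn = nn.Linear(hidden_dim, 1)        # scores each time step
        self.out = nn.Linear(hidden_dim, n_categories)

    def forward(self, seq):                         # seq: (batch, seq_len) of category ids
        h, _ = self.lstm(self.embed(seq))           # (batch, seq_len, hidden_dim)
        weights = torch.softmax(self.attn(h).squeeze(-1), dim=1)
        context = (weights.unsqueeze(-1) * h).sum(dim=1)   # attention-weighted summary
        return self.out(context)                    # logits over the next URL category

model = LstmAttentionNet(n_categories=50)
dummy_batch = torch.randint(0, 50, (8, 20))         # 8 users, 20 recent URL categories each
print(model(dummy_batch).shape)                      # torch.Size([8, 50])
```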

2. Lee, Jinhyung. "Factors Affecting Health Information Technology Expenditure in California Hospitals". In Technology Adoption and Social Issues, 1437–49. IGI Global, 2018. http://dx.doi.org/10.4018/978-1-5225-5201-7.ch066.

Abstract:
This paper investigates the factors affecting health information technology (IT) investment. Different from previous studies, health IT was measured as the dollar amount of hardware, software, and labor related to health IT. This study employed hospital- and patient-level data of the Office of Statewide Health Planning and Development (OSHPD) from 2000 to 2006. The generalized linear model (GLM) was employed with a log link and normal distribution and controlled for clustering error. This study found that not-for-profit and government hospitals, teaching hospitals, competition, and the health IT expenditure of neighboring hospitals were positively associated with health IT expenditure. However, rural hospitals were negatively associated with health IT expenditure. Moreover, this study found a significant increase in health IT investment over seven years, resulting from increased clinical IT adoption.

3. Lee, Jinhyung, and Hansil Choi. "Health Information Technology Spending on the Rise". In Advances in Healthcare Information Systems and Administration, 1–14. IGI Global, 2018. http://dx.doi.org/10.4018/978-1-5225-5460-8.ch001.

Abstract:
In this chapter, the authors track health information technology by examining the factors affecting health information technology (IT) expenditure. The authors employed hospital- and patient-level data of the Office of Statewide Health Planning and Development (OSHPD) from 2000 to 2006. The generalized linear model (GLM) was employed with log link and normal distribution and controlled for clustering error. The authors found that not-for-profit and government hospitals, teaching hospitals, competition, and health IT expenditure of neighborhood hospitals were positively associated with health IT expenditure. However, rural hospitals were negatively associated with health IT expenditure. Moreover, the authors found that mean annual health IT expenditure was approximately $7.4 million from 2000-2006. However, it jumped 204% to $15.1 million from 2008-2014.

4. Abdalla, Abubakr Gafar, Tarig Mohamed Ahmed, and Mohamed Elhassan Seliaman. "Web Usage Mining and the Challenge of Big Data". In Big Data, 899–928. IGI Global, 2016. http://dx.doi.org/10.4018/978-1-4666-9840-6.ch042.

Abstract:
The web is a rich data mining source which is dynamic and fast growing, providing great opportunities which are often not exploited. Web data represent a real challenge to traditional data mining techniques due to its huge amount and the unstructured nature. Web logs contain information about the interactions between visitors and the website. Analyzing these logs provides insights into visitors' behavior, usage patterns, and trends. Web usage mining, also known as web log mining, is the process of applying data mining techniques to discover useful information hidden in web server's logs. Web logs are primarily used by Web administrators to know how much traffic they get and to detect broken links and other types of errors. Web usage mining extracts useful information that can be beneficial to a number of application areas such as: web personalization, website restructuring, system performance improvement, and business intelligence. The Web usage mining process involves three main phases: pre-processing, pattern discovery, and pattern analysis. Various preprocessing techniques have been proposed to extract information from log files and group primitive data items into meaningful, lighter level abstractions that are suitable for mining, usually in forms of visitors' sessions. Major data mining techniques in web usage mining pattern discovery are: clustering, association analysis, classification, and sequential patterns discovery. This chapter discusses the process of web usage mining, its procedure, methods, and patterns discovery techniques. The chapter also presents a practical example using real web log data.

5. Abdalla, Abubakr Gafar, Tarig Mohamed Ahmed, and Mohamed Elhassan Seliaman. "Web Usage Mining and the Challenge of Big Data". In Handbook of Research on Trends and Future Directions in Big Data and Web Intelligence, 418–47. IGI Global, 2015. http://dx.doi.org/10.4018/978-1-4666-8505-5.ch020.

Abstract:
The web is a rich data mining source which is dynamic and fast growing, providing great opportunities which are often not exploited. Web data represent a real challenge to traditional data mining techniques due to its huge amount and the unstructured nature. Web logs contain information about the interactions between visitors and the website. Analyzing these logs provides insights into visitors' behavior, usage patterns, and trends. Web usage mining, also known as web log mining, is the process of applying data mining techniques to discover useful information hidden in web server's logs. Web logs are primarily used by Web administrators to know how much traffic they get and to detect broken links and other types of errors. Web usage mining extracts useful information that can be beneficial to a number of application areas such as: web personalization, website restructuring, system performance improvement, and business intelligence. The Web usage mining process involves three main phases: pre-processing, pattern discovery, and pattern analysis. Various preprocessing techniques have been proposed to extract information from log files and group primitive data items into meaningful, lighter level abstractions that are suitable for mining, usually in forms of visitors' sessions. Major data mining techniques in web usage mining pattern discovery are: clustering, association analysis, classification, and sequential patterns discovery. This chapter discusses the process of web usage mining, its procedure, methods, and patterns discovery techniques. The chapter also presents a practical example using real web log data.

6. S., Sankar Ganesh, Mohanaprasad K., Arunprakash Jayaprakash, and Sivanantham Sathasivam. "Optimized-Fuzzy-Logic-Based Bit Loading Algorithms". In Handbook of Research on Fuzzy and Rough Set Theory in Organizational Decision Making, 305–15. IGI Global, 2017. http://dx.doi.org/10.4018/978-1-5225-1008-6.ch013.

Abstract:
Next generation wireless communication systems promise subscribers a gigabit-data-rate experience at a low Bit Error Rate (BER) under adverse channel conditions. In order to maximize the overall system throughput of Orthogonal Frequency Division Multiplexing (OFDM), adaptive modulation is one of the key solutions. In adaptive modulated OFDM, the subcarriers are allocated data bits and energy in accordance with the Signal to Interference Ratio (SIR) of the multipath channel, which is referred to as adaptive bit loading and adaptive power allocation, respectively. The number of iterations required to allocate the target bits and energy to a subchannel is optimized. The key idea of the paper is to allocate the bits with a minimum number of iterations after clustering the subchannels using fuzzy logic. The proposed method exhibits faster convergence in obtaining the optimal solution.

7. Mehra, Jayanti, and Ramjeevan Singh Thakur. "A Model for Extracting Most Desired Web Pages". In Transforming Businesses With Bitcoin Mining and Blockchain Applications, 119–45. IGI Global, 2020. http://dx.doi.org/10.4018/978-1-7998-0186-3.ch007.

Abstract:
Weblog analysis takes raw data from access logs and analyzes it to extract statistical information. This information covers website activity, such as the average number of hits, total number of user visits, failed and successful cached hits, average view time, and average path length through a website; analytical information, such as "page not found" errors and server errors; server information, which includes exit and entry pages, single-access pages, and top visited pages; and requester information, such as which types of search engines are used, keywords, and top referring sites, and so on. In general, the website administrator uses this kind of knowledge to make the system perform better, to help in the process of managing the site, and also to support marketing decisions. Most advanced web mining systems use this kind of information to extract more complex interpretations using data mining procedures like association rules, clustering, and classification.

Conference papers on the topic "Error log clustering"

1. Ashayeri, Cyrus, and Donald L. Paul. "A Stochastic Method in Investigating Basin-Wide Underlying Distribution Functions of Decline Rate Behavior for Unconventional Resources". In SPE Western Regional Meeting. SPE, 2021. http://dx.doi.org/10.2118/200879-ms.

Abstract:
Basin-wide heterogeneity of production in unconventional resources creates additional risk in field development planning. In the past few years, several data-driven models have been developed to increase the accuracy in predicting the recovery from shale gas and tight oil wells. However, many of the machine learning methods with a so-called "black box" approach provide deterministic results. Therefore, understanding the uncertainty associated with different development scenarios would be difficult. We have investigated the underlying statistical distribution functions that govern the production rates and decline behavior of unconventional wells. Identification and quantification of these distribution functions provide a strong tool to accurately forecast the cumulative production of a large group of wells in an unconventional basin. By understanding the relationship among geologic characteristics of different sections of the asset, and the impact of varying drilling and completion parameters, capital allocation can be done in a more efficient manner. In this paper, we have identified the statistical distribution parameters of decline behavior in a Power Law model. In doing so, we have used unsupervised clustering techniques to find an optimal number of clusters that enables observing well-behaved and identifiable underlying distribution functions. Furthermore, we quantified different types of distribution functions in a trial-and-error workflow to provide a tool for accurately evaluating the impact of varying geologic parameters on the decline behavior of these wells. Our results show that the leading term (or leading coefficient), which also highly correlates with long-term cumulative recovery, follows a Gamma distribution, while the power degree (or power coefficient) follows a Normal distribution. Peak production rate (maximum average daily rate), terminal rate (rate after switch point), and the time of terminal rate occurrence all follow a Log Normal distribution.
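
The distribution-identification step can be reproduced in outline with SciPy: fit candidate distributions to each decline-curve parameter and compare goodness of fit. A hedged sketch (the data file and column names are assumptions, not the paper's dataset):

```python
import pandas as pd
from scipy import stats

# Hypothetical table of per-well power-law decline fits: q(t) = a * t**(-b)
params = pd.read_csv("decline_fits.csv")             # columns: leading_coef, power_coef, peak_rate

candidates = {"gamma": stats.gamma, "normal": stats.norm, "lognormal": stats.lognorm}

for column in ("leading_coef", "power_coef", "peak_rate"):
    data = params[column].dropna().values
    for name, dist in candidates.items():
        fitted = dist.fit(data)                      # maximum-likelihood fit of the candidate
        ks_stat, p_value = stats.kstest(data, dist.name, args=fitted)
        print(f"{column:>13s}  {name:>9s}  KS={ks_stat:.3f}  p={p_value:.3f}")
```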

2. Alfonso, L., F. Caleyo, J. M. Hallen, and J. Araujo. "On the Applicability of Extreme Value Statistics in the Prediction of Maximum Pit Depth in Heavily Corroded Non-Piggable Buried Pipelines". In 2010 8th International Pipeline Conference. ASMEDC, 2010. http://dx.doi.org/10.1115/ipc2010-31321.

Abstract:
There exists a large number of works aimed at the application of Extreme Value Statistics to corrosion. However, there is a lack of studies devoted to the applicability of the Gumbel method to the prediction of maximum pitting-corrosion depth. This is especially true for works considering the typical pit densities and spatial patterns in long, underground pipelines. In the presence of spatial pit clustering, estimations could deteriorate, raising the need to increase the total inspection area in order to obtain the desired accuracy for the estimated maximum pit depth. In most practical situations, pit-depth samples collected along a pipeline belong to distinguishable groups, due to differences in corrosion environments. For example, it is quite probable that samples collected from the pipeline’s upper and lower external surfaces will differ and represent different pit populations. In that case, maximum pit-depth estimations should be made separately for these two quite different populations. Therefore, a good strategy to improve maximum pit-depth estimations is critically dependent upon a careful selection of the inspection area used for the extreme value analysis. The goal should be to obtain sampling sections that contain a pit population as homogenous as possible with regard to corrosion conditions. In this study, the aforementioned strategy is carefully tested by comparing extreme-value-oriented Monte Carlo simulations of maximum pit depth with the results of inline inspections. It was found that the variance to mean ratio, a measure of randomness, and the mean squared error of the maximum pit-depth estimations were considerably reduced, compared with the errors obtained for the entire pipeline area, when the inspection areas were selected based on corrosion-condition homogeneity.
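
A schematic of the Gumbel (block-maxima) step discussed here, using SciPy; the pit-depth data and the number of sampling sections are synthetic assumptions, and in practice the section split should follow corrosion-condition homogeneity as the paper argues.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical pit depths (mm) from an in-line inspection, split into equal sampling sections.
pit_depths = rng.weibull(1.5, size=20000) * 2.0
n_sections = 40
section_maxima = pit_depths.reshape(n_sections, -1).max(axis=1)   # block maxima per section

# Fit a Gumbel distribution to the section maxima.
loc, scale = stats.gumbel_r.fit(section_maxima)

# Return-level style estimate of the deepest pit expected over a larger area of T sections.
T = 400
estimated_max = stats.gumbel_r.ppf(1 - 1.0 / T, loc=loc, scale=scale)
print(f"Gumbel fit: loc={loc:.2f} mm, scale={scale:.2f} mm; "
      f"expected deepest pit over {T} sections = {estimated_max:.2f} mm")
```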

3. Wang, Weizhi, Csaba Pakozdi, Arun Kamath, and Hans Bihs. "High-Fidelity Representation of Three-Hour Offshore Short-Crested Wave Field in the Fully Nonlinear Potential Flow Model REEF3D::FNPF". In ASME 2020 39th International Conference on Ocean, Offshore and Arctic Engineering. American Society of Mechanical Engineers, 2020. http://dx.doi.org/10.1115/omae2020-18262.

Abstract:
Stochastic wave properties are crucial for the design of offshore structures. Short-crested seas are commonly seen at the sites of offshore structures, especially during storm events. A long time duration is required in order to obtain the statistical properties, which is challenging for numerical simulations because of the high demand of computational resources. In this scenario, a potential flow solver is ideal due to its computational efficiency. A procedure of producing accurate representation of short-crested sea states using the open-source fully nonlinear potential flow model REEF3D::FNPF is presented in the paper. The procedure examines the sensitivity of the resolutions in space and time as well as the arrangements of wave gauge arrays. A narrow band power spectrum and a mildly spreading directional spreading function are simulated, and an equal energy method is used to generate input waves to avoid phase-locking. REEF3D::FNPF solves the Laplace equation together with the boundary conditions using a finite difference method. A sigma grid is used in the vertical direction and the vertical grid clustering follows the principle of constant truncation error. High-order discretisation methods are implemented in space and time. Message passing interface is used for high performance computation using multiple processors. Three-hour simulations are performed in full-scale at a hypothetic offshore site with constant water depth. The significant wave height, peak period, kurtosis, skewness and ergodicity are examined in the numerically generated wave field. The stochastic wave properties in the numerical wave tank (NWT) using REEF3D::FNPF match the input wave conditions with high fidelity.

4. Ramcharitar, Kamlesh, and Arti Kandice Ramdhanie. "Using Machine Learning Methods to Identify Reservoir Compartmentalization in Mature Oilfields from Legacy Production Data". In SPE Trinidad and Tobago Section Energy Resources Conference. SPE, 2021. http://dx.doi.org/10.2118/200979-ms.

Abstract:
Despite long production histories, operators of mature oilfields sometimes struggle to account for reservoir compartmentalization. Geology-led workflows do not adequately honor legacy production data, since inherent bias is introduced into the process of allocating production by interpreted flow units. This paper details the application of machine learning methods to identify possible reservoir compartments based on legacy production data recorded from individual well completions. We propose an experimental data-driven workflow to rapidly generate multiple scenarios of connected volumes in the subsurface. The workflow is premised upon the logic that well completions draining the same connected reservoir space can exhibit similar production characteristics (rate declines, GOR trends, and pressures). We show how the specific challenges of digitized legacy data are solved using outlier detection for error checking and Kalman smoothing imputation for missing data in the structural time series model. Finally, we compare the subsurface grouping of completions obtained by applying unsupervised pattern recognition with hierarchical clustering. Application of this workflow results in multiple possible scenarios for defining reservoir compartments based on production data trends only. The method is powerful in that it provides interpretations that are independent of subsurface scenarios generated by more traditional workflows. We demonstrate the potential to integrate interpretations generated from more conventional workflows to increase the robustness of the overall subsurface model. We have leveraged the power of machine learning methods to classify more than forty (40) well completions into discrete reservoir compartments using production characteristics only. This effort would be extremely difficult, or otherwise unreliable, given the inherent limitations of human spatial, temporal, and cognitive abilities.
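
A simplified sketch of the final grouping step (hierarchical clustering of completions by production characteristics); the feature table is hypothetical, and the imputation and outlier-detection steps of the workflow are omitted here.

```python
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.preprocessing import StandardScaler

# Hypothetical per-completion features derived from cleaned production histories,
# e.g. initial rate, decline exponent, GOR trend slope, average flowing pressure.
features = pd.read_csv("completion_features.csv", index_col="completion_id")

Z = linkage(StandardScaler().fit_transform(features), method="ward")

# Cut the dendrogram into a chosen number of candidate reservoir compartments.
features["compartment"] = fcluster(Z, t=6, criterion="maxclust")
print(features["compartment"].value_counts().sort_index())
```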