Journal articles on the topic 'Web usage data mining techniques'

Author: Grafiati

Published: 10 December 2022

Last updated: 28 January 2023

Consult the top 50 journal articles for your research on the topic 'Web usage data mining techniques.'

1

Patel, Ketul, and Dr A. R. Patel. "Process of Web Usage Mining to find Interesting Patterns from Web Usage Data." INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY 3, no. 1 (August 1, 2012): 144–48. http://dx.doi.org/10.24297/ijct.v3i1c.2767.

Abstract:

Traffic on the World Wide Web is increasing rapidly, and a huge amount of data is generated by users' numerous interactions with web sites. Web Usage Mining is the application of data mining techniques to discover useful and interesting patterns from web usage data. It helps to identify frequently accessed pages, predict user navigation, improve web site structure, etc. Applying Web Usage Mining involves several steps. This paper discusses the process of Web Usage Mining, which consists of the following steps: Data Collection, Pre-processing, Pattern Discovery and Pattern Analysis. It also presents Web Usage Mining applications and some Web Mining software.

2

Harika, B., and T. Sudha. "Extraction of Knowledge from Web Server Logs Using Web Usage Mining." Asian Journal of Computer Science and Technology 8, S3 (June 5, 2019): 12–15. http://dx.doi.org/10.51983/ajcst-2019.8.s3.2113.

Abstract:

Information on the internet increases rapidly from day to day, and usage of the web also increases; thus there is a need to discover interesting patterns from the web. The process of extracting and mining useful information from web documents using data mining techniques is called Web Mining. Web Mining is broadly classified into three types, namely Web Content Mining, Web Structure Mining and Web Usage Mining. In this paper our focus is mainly on Web Usage Mining, where we apply data mining techniques to analyse and discover interesting knowledge from Web usage data. The activities of the user are captured and stored at different levels, such as server level, proxy level and user level; this is called Web Usage Data. The usage data stored at the server side is the Web Server Log, which records the browsing behavior of users and their requests based on user clicks. The Web Server Log is a primary source for performing Web Usage Mining. This paper also discusses various existing pre-processing techniques, the analysis of web log files, and how clustering is applied to group users based on their browsing behavior and the content they are interested in.

3

V, Sathiyamoorthi, and Murali Bhaskaran V. "DATA PREPARATION TECHNIQUES FOR WEB USAGE MINING IN WORLD WIDE WEB." International Journal on Information Sciences and Computing 4, no. 1 (2010): 55–60. http://dx.doi.org/10.18000/ijisac.50067.

4

Yau, Ng Qi, and Wan Zainon. "UNDERSTANDING WEB TRAFFIC ACTIVITIES USING WEB MINING TECHNIQUES." International Journal of Engineering Technologies and Management Research 4, no. 9 (February 1, 2020): 18–26. http://dx.doi.org/10.29121/ijetmr.v4.i9.2017.96.

Abstract:

Web Usage Mining is a computational process of discovering patterns in large data sets, involving methods from artificial intelligence, machine learning, statistical analysis and database systems, with the goal of extracting valuable information from the server access logs of World Wide Web data repositories and transforming it into an understandable structure for further use. The main focus of this paper is on exploring methods that expedite the log mining process, presenting the results of log mining through data visualization, and comparing data-mining algorithms. For the comparison between classification techniques, precision, recall and ROC area are the measures used to compare algorithms. The study shows that Naïve Bayes and Bayes Network are the best algorithms for this purpose.

5

HOGO, MOFREH, MIROSLAV SNOREK, and PAWAN LINGRAS. "TEMPORAL VERSUS LATEST SNAPSHOT WEB USAGE MINING USING KOHONEN SOM AND MODIFIED KOHONEN SOM BASED ON THE PROPERTIES OF ROUGH SETS THEORY." International Journal on Artificial Intelligence Tools 13, no. 03 (September 2004): 569–91. http://dx.doi.org/10.1142/s0218213004001697.

Abstract:

Temporal Web usage mining involves the application of data mining techniques to temporal Web usage data to discover temporal usage patterns, which describe the temporal behavior of users on a Web site during different time slices. Clustering and classification are two important functions in Web mining. Classes and associations in Web mining do not necessarily have crisp boundaries. Conventional clustering techniques are therefore unsuitable for finding such clusters and associations, and conventional classification algorithms provide crisp classes, which are not suitable in real-world applications. This opens the way for non-conventional clustering techniques, such as fuzzy and rough sets, in Web mining clustering applications. Recent research introduced an adaptation of the Kohonen SOM based on the properties of rough sets theory to find interval set clusters for users on the Internet. This paper compares latest snapshot Web usage mining with temporal Web usage mining, and compares temporal Web usage mining using the conventional Kohonen SOM with that using the modified Kohonen SOM based on the properties of rough sets theory.

6

Ezzikouri, Hanane, Mohamed Fakir, Cherki Daoui, and Mohamed Erritali. "Extracting Knowledge from Web Data." Journal of Information Technology Research 7, no. 4 (October 2014): 27–41. http://dx.doi.org/10.4018/jitr.2014100103.

Abstract:

User behavior on a website triggers a sequence of queries whose result is the display of certain pages. Information about these queries (including the names of the resources requested and the responses from the Web server) is stored in a text file called a log file. Analysis of the server log file can provide significant and useful information. Web Mining is the extraction of interesting and potentially useful patterns and implicit information from artifacts or activity related to the World Wide Web. Web usage mining is a main research area in Web mining, focused on learning about Web users and their interactions with Web sites. The motive of mining is to find users' access models automatically and quickly from the vast Web log file, such as frequent access paths, frequent access page groups and user clustering. Through Web Usage Mining, information left by user access can be mined to provide a foundation for organizational decision making. The process of Web mining, defined as the set of techniques designed to explore, process and analyze large masses of consecutive information activities on the Internet, has three main steps: data preprocessing, extraction of usage patterns, and interpretation of results. This paper starts by presenting different formats of web log files, then presents the different preprocessing methods that have been used, and finally presents a system for "Web Content and Usage Mining" for web data extraction and web site analysis using the data mining algorithms Apriori, FPGrowth, K-Means, KNN, and ID3.
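As a rough illustration of the log file this abstract describes (not the authors' system), the sketch below parses one line in the widely used Common Log Format with a regular expression. The sample line, the function name `parse_clf_line` and the field names are assumptions made for this example.

```python
import re
from datetime import datetime

# Common Log Format: host ident authuser [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # A size of "-" means no body was returned; treat it as zero bytes.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    # The request field looks like: METHOD URL PROTOCOL
    method, _, rest = record["request"].partition(" ")
    record["method"] = method
    record["url"] = rest.rsplit(" ", 1)[0] if " " in rest else rest
    return record

# Hypothetical sample line, used only to exercise the parser.
sample = '192.0.2.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
entry = parse_clf_line(sample)
```

Lines that do not match the pattern return `None`, so a preprocessing step could count or discard them before pattern discovery.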

7

DE, SUPRIYA KUMAR, and P. RADHA KRISHNA. "MINING WEB DATA USING CLUSTERING TECHNIQUE FOR WEB PERSONALIZATION." International Journal of Computational Intelligence and Applications 02, no. 03 (September 2002): 255–65. http://dx.doi.org/10.1142/s1469026802000580.

Abstract:

Clustering of data in a high-dimensional space is of great interest in many data mining applications. In this paper, we propose a method for clustering web usage data in a high-dimensional space based on a concept hierarchy model. In this method, the relationships present in the web usage data are mapped into a fuzzy proximity relation of user transactions. We also describe an approach for presenting the preference set of URLs to a new user transaction based on the match score with the clusters. The study demonstrates that our approach is general and effective for mining web data for web personalization.

8

Ramanathaiah, Ramakrishnan M., Bhawna Nigam, and M. Niranjanamurthy. "Construction of User’s Navigation Sessions from Web Logs for Web Usage Mining." Journal of Computational and Theoretical Nanoscience 17, no. 9 (July 1, 2020): 4432–37. http://dx.doi.org/10.1166/jctn.2020.9091.

Abstract:

Web Usage Mining applies mining techniques to log data to extract the behavior of users. The knowledge mined from the web log can be utilized in web personalization, prediction, prefetching, restructuring of web sites, etc. It consists of three steps: preprocessing, pattern detection and analysis. Web log information is typically noisy and uncertain, so preprocessing is a significant process ahead of mining. The patterns discovered after applying the mining techniques depend on the accuracy of the web log, which in turn depends on the preprocessing phase. The output of preprocessing should be the user's navigation session file. This paper proposes preprocessing techniques and a method for constructing the user's navigation session file.
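The abstract does not spell out how sessions are constructed, so the following is only a minimal sketch of one common heuristic: requests from the same user are grouped, and a new session starts whenever the gap between consecutive requests exceeds a timeout. The 30-minute threshold, the use of an IP address as the user key and the name `build_sessions` are assumptions, not details from the paper.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed threshold, not from the paper

def build_sessions(requests, timeout=SESSION_TIMEOUT):
    """Group (user, timestamp, url) tuples into per-user navigation sessions."""
    by_user = defaultdict(list)
    for user, timestamp, url in requests:
        by_user[user].append((timestamp, url))

    sessions = []
    for user, visits in by_user.items():
        visits.sort()  # order each user's requests by time
        current = [visits[0][1]]
        last_time = visits[0][0]
        for timestamp, url in visits[1:]:
            if timestamp - last_time > timeout:
                # Gap too long: close the current session and start a new one.
                sessions.append((user, current))
                current = []
            current.append(url)
            last_time = timestamp
        sessions.append((user, current))
    return sessions

# Hypothetical requests, used only to exercise the function.
t0 = datetime(2020, 7, 1, 9, 0)
requests = [
    ("10.0.0.1", t0, "/home"),
    ("10.0.0.1", t0 + timedelta(minutes=5), "/products"),
    ("10.0.0.2", t0 + timedelta(minutes=6), "/home"),
    ("10.0.0.1", t0 + timedelta(minutes=50), "/contact"),
]
sessions = build_sessions(requests)
```

In this toy input the first user's third request arrives 45 minutes after the second, so it opens a second session for that user.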

9

Hamodi, Yaser Issam, Ruaa Riyadh Hussein, and Naeem Th Yousir. "Development of a Unifying Theory for Data Mining Using Clustering Techniques." Webology 17, no. 2 (December 21, 2020): 01–14. http://dx.doi.org/10.14704/web/v17i2/web17012.

Abstract:

A performance evaluation of four clustering techniques was carried out, based on segmenting consumers by product type and by product usage. The Cobweb, DBSCAN, EM and k-means algorithms were evaluated on computational time, accuracy and purity of the results produced. The experiment was performed using WEKA as a data mining tool. The evaluation showed that k-means outperformed the others on all considered evaluation measures, while EM was second best in terms of accuracy and purity, outperforming the other two. DBSCAN was third best of the selected algorithms, even though its computational time is shorter than that of EM, and Cobweb was the fourth-best performing algorithm with respect to purity, accuracy and computational time.

10

Abraham, Ajith. "Business Intelligence from Web Usage Mining." Journal of Information & Knowledge Management 02, no. 04 (December 2003): 375–90. http://dx.doi.org/10.1142/s0219649203000565.

Abstract:

The rapid e-commerce growth has made both business community and customers face a new situation. Due to intense competition on the one hand and the customer's option to choose from several alternatives, the business community has realized the necessity of intelligent marketing strategies and relationship management. Web usage mining attempts to discover useful knowledge from the secondary data obtained from the interactions of the users with the Web. Web usage mining has become very critical for effective Web site management, creating adaptive Web sites, business and support services, personalization, network traffic flow analysis and so on. This paper presents the important concepts of Web usage mining and its various practical applications. Further a novel approach called "intelligent-miner" (i-Miner) is presented. i-Miner could optimize the concurrent architecture of a fuzzy clustering algorithm (to discover web data clusters) and a fuzzy inference system to analyze the Web site visitor trends. A hybrid evolutionary fuzzy clustering algorithm is proposed to optimally segregate similar user interests. The clustered data is then used to analyze the trends using a Takagi-Sugeno fuzzy inference system learned using a combination of evolutionary algorithm and neural network learning. Proposed approach is compared with self-organizing maps (to discover patterns) and several function approximation techniques like neural networks, linear genetic programming and Takagi–Sugeno fuzzy inference system (to analyze the clusters). The results are graphically illustrated and the practical significance is discussed in detail. Empirical results clearly show that the proposed Web usage-mining framework is efficient.

11

Anis, M. Z., A. Mukherjee, and B. K. Roy. "Application of Data Mining Techniques for Proper Design of Knowledge Repository." Journal of Information & Knowledge Management 06, no. 03 (September 2007): 201–9. http://dx.doi.org/10.1142/s0219649207001767.

Abstract:

This paper reports a basic assessment of the usage of a knowledge repository called "XYZ Knowledge Repository", maintained by a company PQR for its client XYZ. Subsequently, a data mining method for finding sequential patterns in the web usage of the users was used. This captures the level of association between the web pages through two parameters, support and confidence. Depending upon the values of these two parameters, concepts of strong association, weak association and no association are realised. In case of strong association we recommended the pages be grouped together; while in the case of weak association, a link may be provided.
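The two parameters named in this abstract can be illustrated with a small sketch. It computes support and confidence for page rules a -> b over a set of sessions, ignoring the order of visits (a simplification relative to the sequential patterns the paper mentions). The session data and the name `rule_metrics` are hypothetical, and no thresholds for strong or weak association are assumed because the abstract does not give any.

```python
from itertools import permutations

def rule_metrics(sessions):
    """Return {(a, b): (support, confidence)} for page rules a -> b.

    support    = fraction of sessions containing both a and b
    confidence = fraction of sessions containing a that also contain b
    """
    total = len(sessions)
    page_sets = [set(s) for s in sessions]
    pages = set().union(*page_sets)
    metrics = {}
    for a, b in permutations(sorted(pages), 2):
        count_a = sum(1 for s in page_sets if a in s)
        count_ab = sum(1 for s in page_sets if a in s and b in s)
        if count_ab:
            metrics[(a, b)] = (count_ab / total, count_ab / count_a)
    return metrics

# Hypothetical sessions, used only to exercise the function.
sessions = [
    ["/faq", "/manual"],
    ["/faq", "/manual", "/contact"],
    ["/faq"],
    ["/contact"],
]
metrics = rule_metrics(sessions)
```

Here `/manual -> /faq` has support 0.5 and confidence 1.0, since every session that visits `/manual` also visits `/faq`, while the reverse rule has the same support but a confidence of 2/3.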

12

ZHANG, QINGYU, and RICHARD S. SEGALL. "WEB MINING: A SURVEY OF CURRENT RESEARCH, TECHNIQUES, AND SOFTWARE." International Journal of Information Technology & Decision Making 07, no. 04 (December 2008): 683–720. http://dx.doi.org/10.1142/s0219622008003150.

Abstract:

The purpose of this paper is to provide a more current evaluation and update of web mining research and techniques available. Current advances in each of the three different types of web mining are reviewed in the categories of web content mining, web usage mining, and web structure mining. For each tabulated research work, we examine such key issues as web mining process, methods/techniques, applications, data sources, and software used. Unlike previous investigators, we divide web mining processes into the following five subtasks: (1) resource finding and retrieving, (2) information selection and preprocessing, (3) patterns analysis and recognition, (4) validation and interpretation, and (5) visualization. This paper also reports the comparisons and summaries of selected software for web mining. The web mining software selected for discussion and comparison in this paper are SPSS Clementine, Megaputer PolyAnalyst, ClickTracks by web analytics, and QL2 by QL2 Software Inc. Applications of these selected web mining software to available data sets are discussed together with abundant presentations of screen shots, as well as conclusions and future directions of the research.

13

Aruna, V., et al. "A Review on Design and Development Of Sequential Patterns Algorithms In Web Usage Mining." Turkish Journal of Computer and Mathematics Education (TURCOMAT) 12, no. 2 (April 10, 2021): 1634–39. http://dx.doi.org/10.17762/turcomat.v12i2.1448.

Abstract:

In recent years, with the advancement of technology, a lot of information has become available in different formats, and extracting knowledge from that data has become a very difficult task. Due to the vast amount of information available on the web, users find it difficult to extract relevant information or create new knowledge from it. To solve this problem, Web mining techniques are used to discover interesting patterns in the hidden data. Web Usage Mining (WUM), a subset of Web Mining, helps in extracting the hidden knowledge present in Web log files, in recognizing the various interests of web users, and in discovering customer behaviours. Web Usage Mining comprises the phases Data Pre-processing, Pattern Discovery and Pattern Analysis. This paper presents an updated, focused survey of sequential pattern mining algorithms used in the Pattern Discovery phase of WUM, such as Apriori-based algorithms, the Breadth First Search-based strategy, the Depth First Search strategy, sequential closed-pattern algorithms and incremental pattern mining algorithms. Finally, the algorithms are compared on their key features. This study gives a better understanding of the approaches to sequential pattern mining.
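Sequential pattern mining algorithms of the kind surveyed here rest on one basic quantity: the support of an ordered pattern, i.e. the fraction of sessions that contain its pages in the same order, not necessarily adjacent. The sketch below shows only that building block, not any of the surveyed algorithms; the function names and sample sessions are hypothetical.

```python
def contains_subsequence(sequence, pattern):
    """True if pattern's items occur in sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    # "item in remaining" advances the iterator, so order is enforced.
    return all(item in remaining for item in pattern)

def sequence_support(sessions, pattern):
    """Fraction of sessions that contain pattern as an ordered subsequence."""
    hits = sum(1 for s in sessions if contains_subsequence(s, pattern))
    return hits / len(sessions)

# Hypothetical sessions, used only to exercise the functions.
sessions = [
    ["A", "B", "C"],
    ["A", "C"],
    ["C", "A"],
    ["B", "C"],
]
support_ac = sequence_support(sessions, ["A", "C"])
```

Only the first two sessions visit A before C, so the pattern's support is 0.5; the session `["C", "A"]` contains both pages but in the wrong order.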

14

H. Al-Hama, Alaa, Mohammad Ala`a Al-Ha ., and Soukaena Hassan Hash . "Applying Data Mining Techniques in Intrusion Detection System on Web and Analysis of Web Usage." Information Technology Journal 5, no. 1 (December 15, 2005): 57–63. http://dx.doi.org/10.3923/itj.2006.57.63.

15

Banu, P. K. Nizar, and H. Inbarani. "Analysis of Click Stream Patterns using Soft Biclustering Approaches." International Journal of Information Technologies and Systems Approach 4, no. 1 (January 2011): 53–66. http://dx.doi.org/10.4018/jitsa.2011010104.

Abstract:

As websites increase in complexity, locating needed information becomes a difficult task. Such difficulty is often related to the websites’ design but also to ineffective and inefficient navigation processes. Research in web mining addresses this problem by applying techniques from data mining and machine learning to web data and documents. In this study, the authors examine web usage mining, applying data mining techniques to web server logs. Web usage mining has gained much attention as a potential approach to fulfill the requirement of web personalization. In this paper, the authors propose K-means biclustering, rough biclustering and fuzzy biclustering approaches to disclose the duality between users and pages by grouping them in both dimensions simultaneously. The simultaneous clustering of users and pages discovers biclusters that correspond to groups of users that exhibit highly correlated ratings on groups of pages. The results indicate that the fuzzy C-means biclustering algorithm performs best and is able to detect partial matching of preferences.

16

Jha, Santosh Kumar. "Technological Prospects of Cloud Computing in Web Mining: Recent Trends and Opportunities." International Journal of Engineering Technology and Management Sciences 7, no. 1 (February 28, 2023): 98–104. http://dx.doi.org/10.46647/ijetms.2023.v07i01.017.

Abstract:

The Web has immense potential to grow with new technologies and flourish with new opportunities in almost every sphere of human life. The internet grew with the WWW and expanded with e-business and social networks. The Web is involved in the mining of various kinds of data, ranging from users' views and patterns to Bitcoin applications. Web mining takes different sets and types of data from various web sources, extracts useful information, and gains knowledge by applying suitable data mining techniques to the dataset. Web mining is broadly categorised into three main areas: web usage mining, web content mining, and web structure mining. Each category deals with the heterogeneous behaviour of web data. All three techniques and their applications require high-end architectures that can provide the infrastructure and support they need. Cloud computing is an emerging technology that offers intensive support for a variety of applications. The techniques and applications of Web usage mining are in high demand in cloud computing. Cloud computing allows these techniques to retrieve relevant and useful data through a virtually integrated data warehouse. It helps users reduce the cost and infrastructure of implementation. This paper presents methodologies of web mining using cloud computing technology and its prospects.

17

RÍOS, SEBASTIÁN A., and JUAN D. VELÁSQUEZ. "FINDING REPRESENTATIVE WEB PAGES BASED ON A SOM AND A REVERSE CLUSTER ANALYSIS." International Journal on Artificial Intelligence Tools 20, no. 01 (February 2011): 93–118. http://dx.doi.org/10.1142/s0218213011000048.

Abstract:

Enhancing the content and structure of a web site is a very important task that can help retain visitors and gain new visits (or customers). Web mining helps to enhance a web site's organization and contents using data mining algorithms. In particular, when Web mining is performed using a Self-Organizing Feature Map (SOFM or SOM), an analysis phase by experts is always needed. To help analysts perform this phase after the SOFM's training, many post-processing techniques have been developed (component planes, labels, etc.); however, none of these techniques is useful when working in web mining for off-line enhancements of a web site. This paper presents an algorithm called Reverse Cluster Analysis (RCA). It aims to identify important web pages based on a self-organizing feature map (SOFM) when performing web text mining (WTM) and web usage mining (WUM). We successfully applied this technique to a real web site to show its effectiveness. We have extended previous work by performing a comparison with another unsupervised technique, an administrators' survey and an extended survey.

18

Yadao, Shivani, Dr Vinaya Babu A., Dr Midhunchakkaravarthy Janarthanan, and Dr Amiya Bhaumik. "A Semantically Enhanced Deep Neural Network Framework for Reputation System in Web Mining for Covid-19 Twitter Dataset." Webology 19, no. 1 (January 20, 2022): 3911–28. http://dx.doi.org/10.14704/web/v19i1/web19258.

Abstract:

With the web containing a huge amount of information, the extraction of application-oriented, understandable data has become easier with web mining. Web mining is an area derived from data mining. Unlike data mining, web mining is used to extract interesting patterns from the information available on the web. When used with deep learning, pattern recognition becomes much easier. Deep learning works the way a human brain does in predicting outcomes when a bulk of information is provided. It deals with mathematical models that recognize patterns efficiently. There are different types of web mining techniques, namely web content mining (WCM), web structure mining (WSM) and web usage mining (WUM). Researchers and economists around the globe are keen to know the impact of the pandemic on society's economic status; this work helps find it using a reputation system. As Twitter is a hub of different public opinions, we work with a Covid-19 data set from Twitter. A reputation system helps find the socio-economic status of the tweets in the Covid-19 dataset. This paper proposes a framework in which web mining is implemented using a semantically enhanced deep neural network technique for the reputation system.

19

Ridho, Farid, and Fachruddin Mansyur. "ANALISIS POLA PERMINTAAN PUBLIKASI DATA BADAN PUSAT STATISTIK MENGGUNAKAN ASSOCIATION RULE APRIORI." KLIK - KUMPULAN JURNAL ILMU KOMPUTER 7, no. 2 (June 28, 2020): 187. http://dx.doi.org/10.20527/klik.v7i2.322.

Abstract:

BPS is a data provider body in Indonesia. In publishing, BPS uses a variety of media, one of which is the BPS website. To get data through the BPS website, users can visit the website and then download the data they need. The services obtained by data users on the BPS website depend on the quality of the website. The better the quality, the better the service experience gained by data users. The method that can be used to improve the quality of a website is the web usage mining method. Web usage mining is the application of data mining techniques on web repositories to study usage patterns. The purpose of this study is to determine the pattern of data publication requests on the BPS website, which can later be used as a reference to improve the quality of BPS website services. Based on the results of the study, it was found that data users tend to access the same data with different years simultaneously. When data were grouped by title without year, quite diverse rules were obtained.

Keywords: web usage mining, association rule, apriori

20

Gul, Sumeer, Shohar Bano, and Taseen Shah. "Exploring data mining: facets and emerging trends." Digital Library Perspectives 37, no. 4 (October 20, 2021): 429–48. http://dx.doi.org/10.1108/dlp-08-2020-0078.

Abstract:

Purpose: Data mining, along with its varied technologies like numerical mining, textual mining, multimedia mining, web mining, sentiment analysis and big data mining, proves itself as an emerging field and manifests itself in the form of different techniques such as information mining; big data mining; big data mining and Internet of Things (IoT); and educational data mining. This paper aims to discuss how these technologies and techniques are used to derive information and, eventually, knowledge from data.

Design/methodology/approach: An extensive review of literature on data mining and its allied techniques was carried out to ascertain the emerging procedures and techniques in the domain of data mining. Clarivate Analytics' Web of Science and Sciverse Scopus were explored to discover the extent of literature published on data mining and its varied facets. Literature was searched against various keywords such as data mining; information mining; big data; big data and IoT; and educational data mining. Further, the works citing the literature on data mining were also explored to visualize a broad gamut of emerging techniques in this growing field.

Findings: The study validates that knowledge discovery in databases has rendered data mining an emerging field; the data present in these databases paves the way for data mining techniques and analytics. This paper provides a unique view of the usage of data and the logical patterns derived from it, and of how new procedures, algorithms and mining techniques are being continuously upgraded for multipurpose use for the betterment of human life and experiences.

Practical implications: The paper highlights different aspects of data mining, its different technological approaches, and how these emerging data technologies are used to derive logical insights from data and make data more meaningful.

Originality/value: The paper tries to highlight the current trends and facets of data mining.

21

Ouf, Shimaa, Yehia Helmy, and Merna Ashraf. "Web Mining Techniques - A Framework to Enhance Customer Retention." International Journal of e-Collaboration 19, no. 1 (January 13, 2023): 1–30. http://dx.doi.org/10.4018/ijec.315790.

Abstract:

In e-commerce, retaining customers on the web is a difficult task that requires a good understanding of customers' behavior to be able to predict their needs and interests. Web usage mining (WUM), which is the application of data mining techniques to improve business, helps in understanding customers' behavior on the web. Therefore, this paper proposes and implements a framework to enhance the quality of customer recommendations. Providing customers with what they are looking for helps increase their satisfaction, which will lead to improved retention with the company. The proposed framework was tested and evaluated. The results illustrate that recommendations based on merged techniques (like clustering, classification, association, and sequential discovery) achieve strong accuracy, with a precision of 74%, coverage of 100%, and an average overall efficiency (F-measure) of 86%, which means that the merged technique outperformed each individual technique and attained much higher overall coverage.

22

SelviMohana, M., and B. Rosiline Jeetha. "Methodologies on user Behavior Analysis and Future Request Prediction in Web usage Mining using Data mining Techniques." International Journal of Web Technology 003, no. 001 (June 10, 2014): 15–18. http://dx.doi.org/10.20894/ijwt.104.003.001.004.

23

Alshamaila, Yazan, Ibrahim Aljarah, and Ala’ M. Al-Zoubi. "Explaining Individuals’ Usage of Social Commerce: A Data Mining Approach." Modern Applied Science 12, no. 8 (July 28, 2018): 116. http://dx.doi.org/10.5539/mas.v12n8p116.

Abstract:

With the use of Web 2.0 technology, e-commerce is undergoing a radical change that enriches consumer involvement and enables a better understanding of economic value. This emerging phenomenon is known as social commerce. Social commerce (s-commerce) presents a new alternative for consumers to search for and find information about products they are seeking to buy. In spite of its universality, the adoption of this burgeoning technology is affected by several factors. This research project is an initial attempt to explore individuals’ intention of s-commerce usage through the data mining approach. The data was collected via a web-based questionnaire survey of 360 social network site (SNS) users in Jordan. Data mining techniques were then used to analyze the collected data in order to figure out what group of features is best for predicting s-commerce adoption among SNS users. The results showed that data characteristics related to gender, monthly income, civil status, number of connections, and prior online shopping experience are key factors in the classification process. The findings may assist researchers in investigating social commerce issues and aid practitioners in developing new s-commerce strategies.

APA, Harvard, Vancouver, ISO, and other styles

24

Yang, Zhen Jian, and Ke Wen Xia. "A Recommendation System Framework Based on Web Mining." Applied Mechanics and Materials 151 (January 2012): 576–82. http://dx.doi.org/10.4028/www.scientific.net/amm.151.576.

Full text

Abstract:

Recommendation systems have gradually become an important part of e-commerce, and more and more research papers about recommendation systems in e-commerce have appeared in conferences and journals. As e-commerce expands, it also faces a series of challenges. The traditional collaborative filtering recommendation technique struggles to provide recommendation services for unregistered users. To overcome this problem, we suggest a recommendation system framework based on web mining. It is made up of two parts, offline and online. The method first clusters web usage data, web content data, and web structure data separately, then provides high-quality recommendation services based on the mining results. Compared with traditional collaborative filtering techniques, recommendation systems based on web mining are convenient for users because users need not provide rating data explicitly. At the end of the paper, the accuracy of the recommendation system based on web mining is tested and compared with a traditional collaborative filtering recommendation system. The test results show that the quality of the recommendation system based on web mining is better than that of the traditional collaborative filtering recommendation system.

APA, Harvard, Vancouver, ISO, and other styles

25

McGrath, Owen. "Data Mining User Activity in Free and Open Source Software (FOSS)/ Open Learning Management Systems." International Journal of Open Source Software and Processes 2, no. 1 (January 2010): 65–75. http://dx.doi.org/10.4018/jossp.2010010105.

Full text

Abstract:

Free and Open Source Software (FOSS)/Open Educational Systems development projects abound in higher education today. Many universities worldwide have adopted open source software like ATutor and Moodle as an alternative to commercial or homegrown systems. The move to open source learning management systems entails many special considerations, including usage analysis facilities. The tracking of users and their activities poses major technical and analytical challenges within web-based systems. This paper examines how user activity tracking challenges are met with data mining techniques, particularly web usage mining methods, in four different open learning management systems: ATutor, LON-CAPA, Moodle, and Sakai. As examples of data mining technologies adapted within widely used systems, they represent important first steps for moving educational data mining outside the research laboratory. Moreover, as examples of different open source development contexts, they exemplify the potential for programmatic integration of data mining technology processes in the future. As open systems mature in the use of educational data mining, they move closer to the long-sought goal of achieving more interactive, personalized, adaptive learning environments online on a broad scale.

APA, Harvard, Vancouver, ISO, and other styles

26

Saleh, Hillal, and Soukaena Hasheem. "A proposed strategy to secure web usage." Journal of Al-Rafidain University College For Sciences (Print ISSN: 1681-6870, Online ISSN: 2790-2293), no. 1 (October 27, 2021): 19–27. http://dx.doi.org/10.55562/jrucs.v22i1.489.

Full text

Abstract:

With so much data on the web, it can be difficult, frustrating, and seemingly impossible to find the exact information you need. The web offers many powerful search utilities, called search engines, as well as visitor tracking, which studies the behavior of a site's visitors in order to improve the efficiency of that site. This research concentrates on one particular aspect: applying a data mining technique, specifically an association analysis algorithm, to encrypted web log files in order to preserve the privacy of the original data in these files. Because the input data given to the mining algorithm is encrypted, the resulting association rules are also encrypted, which preserves the privacy of the extracted knowledge. The decrypted web log data is then analyzed for web usage in order to study visitor tracking. Based on this study, the server configurations and all the services will be improved.

APA, Harvard, Vancouver, ISO, and other styles

27

Intayoad, Wacharawan, Chayapol Kamyod, and Punnarumol Temdee. "Synthetic Minority Over-Sampling for Improving Imbalanced Data in Educational Web Usage Mining." ECTI Transactions on Computer and Information Technology (ECTI-CIT) 12, no. 2 (February 20, 2019): 118–29. http://dx.doi.org/10.37936/ecti-cit.2018122.133280.

Full text

Abstract:

Educational data mining is the method for extracting and discovering new knowledge from education data. As education data is often complex and imbalanced, it requires a data preprocessing step or learning algorithms in order to obtain accurate analysis and interpretation. Many studies emphasize classification and clustering methods in order to gain insight and comprehensive knowledge from education data. However, only a small number of previous works have focused exclusively on the preprocessing of education data, particularly on the topic of imbalanced datasets. Therefore, the objective of this research is to enhance the accuracy of data classification in educational web usage data. Our study involves the application of synthetic minority over-sampling techniques (SMOTE) to preprocess the raw dataset from web usage data. The minority class is the group of students who failed the examination, and the majority class is the students who passed the examination. In our experiments, four synthetic minority over-sampling methods are applied, namely SMOTE and its variants Borderline-SMOTE1, Borderline-SMOTE2, and SVM-SMOTE, in order to balance the number of samples in the minority class. The experiments are evaluated by comparing the results from well-known classification methods: Naive Bayes, decision tree, and k-nearest neighbors. The study experiments with real-world datasets from education data. The results show that synthetic minority over-sampling methods are capable of improving the detection of the minority class and improve classification performance in terms of precision, recall, and F1-value.
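To make the over-sampling idea in this abstract concrete, the following is a minimal sketch of plain SMOTE-style interpolation using only the Python standard library. It is an illustration under stated assumptions, not the authors' implementation: the function name `smote`, the neighbour count `k`, and the toy feature vectors are all hypothetical, and the Borderline and SVM variants are not covered.

```python
import random

def smote(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points by interpolating between a minority-class
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# Hypothetical usage features for students in the minority ("failed") class.
failed = [(1.0, 5.0), (2.0, 4.0), (1.5, 6.0), (3.0, 5.5)]
new_points = smote(failed, n_synthetic=6)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by the minority class.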

APA, Harvard, Vancouver, ISO, and other styles

28

Muhammad Ahlami Ashraf Roslan, Murizah Kassim. "Analysis of Students’ Web Browsing Behaviours Using Data Mining at a Campus Network." Turkish Journal of Computer and Mathematics Education (TURCOMAT) 12, no. 6 (April 5, 2021): 2726–38. http://dx.doi.org/10.17762/turcomat.v12i6.5779.

Full text

Abstract:

Analytics provides insight to people based on the analysis of past usage, using techniques such as statistics, data mining, machine learning, and artificial intelligence. The lack of a browsing monitoring system causes low engagement, which reduces the growth of certain businesses, as a result of unnecessary browsing during students' learning time. This paper presents an analysis of browsing behavior that classifies browsed words according to their ethical word groups. An analytics platform is created as a monitoring system for browsing behavior. Data mining, indexing, and classification methods are used in this research, as data is the essential key to creating a predictive model, and four types of ethical groups have been filtered based on the browsing behaviors. The browsed words are categorized into four types of browsing: queries, applications, social media, and campus-related sites. The research method applies software tools and a data mining process to the browsing data, and the analytics are presented on a dashboard developed mainly in the R programming language. A few unethical words found using the indexing method are plotted in analytic graphs based on the type of browsing versus time. The data on students’ browsing behavior was taken from the browsing databases of personal computers and laboratory computers in the campus network. The results show that the "other" category is the largest, reaching 79.6% for personal computer browsing compared to 72.4% for browsing on the laboratory computers. About 21% of the browsing behavior was filtered during the data mining process. The "other" category remains on the research agenda, as these libraries must be filtered in detail to identify whether they are learning or non-learning activities.
This research is significant in that it helps to increase the effectiveness of suggestion applications, optimize internet usage by blocking unnecessary words or webpages, and even support campus guide systems by monitoring the browsing behavior of students using the campus network computer labs.

APA, Harvard, Vancouver, ISO, and other styles

29

Ali, N. M., A. M. Gadallah, H. A. Hefny, and B. A. Novikov. "Online web navigation assistant." Vestnik Udmurtskogo Universiteta. Matematika. Mekhanika. Komp'yuternye Nauki 31, no. 1 (March 2021): 116–31. http://dx.doi.org/10.35634/vm210109.

Full text

Abstract:

The problem of finding relevant data while searching the internet represents a big challenge for web users due to the enormous amounts of available information on the web. These difficulties are related to the well-known problem of information overload. In this work, we propose an online web assistant called OWNA. We developed a fully integrated framework for making recommendations in real-time based on web usage mining techniques. Our work starts with preparing raw data, then extracting useful information that helps build a knowledge base as well as assigns a specific weight for certain factors. The experiments show the advantages of the proposed model against alternative approaches.

APA, Harvard, Vancouver, ISO, and other styles

30

Gandhimathi, D., and N. Anbazhagan. "Extracting of Positive and Negative Association Rules." International Journal of Emerging Research in Management and Technology 6, no. 8 (June 25, 2018): 421. http://dx.doi.org/10.23956/ijermt.v6i8.175.

Full text

Abstract:

Association rule analysis is a basic technique for exposing how items or patterns are associated with each other. Two common measures of association are support and confidence. Several methods have been proposed in the literature to reduce the number of extracted association rules. Association rule mining is one of the most prominent current data mining techniques, designed to group objects together from huge databases with the aim of extracting interesting correlations and relations from massive quantities of data. Association rule mining is used to discover associated patterns in datasets. In this paper, we propose new methods for deriving association rules in web usage mining. Generally, a web usage log contains many records, so unwanted records have to be removed from the large dataset. First, the pre-processed data from the NASA dataset is clustered by the popular K-Means algorithm. Subsequently, a matrix calculation is performed on that data. Further, associations are mined on the filtered data to get the final associated page results. Positive and negative association rules are gathered by using a new algorithm with an Annul Object (𝒜𝒪). Wherever the object “𝒜𝒪” is present, the rule is known as a negative association rule. Otherwise, the rule is a positive association rule.
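The two measures named in this abstract, support and confidence, can be sketched in a few lines of Python. This is only an illustration of the standard definitions: the session data and page names are hypothetical, and the paper's Annul Object algorithm for negative rules is not reproduced here.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical user sessions, each reduced to the set of pages visited.
sessions = [
    {"/home", "/docs", "/download"},
    {"/home", "/docs"},
    {"/home", "/download"},
    {"/docs", "/download"},
]

s = support(sessions, {"/home", "/docs"})       # 2 of 4 sessions
c = confidence(sessions, {"/home"}, {"/docs"})  # 2 of the 3 sessions with /home
```

A rule such as `/home -> /docs` is usually kept only when both values exceed user-chosen thresholds.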

APA, Harvard, Vancouver, ISO, and other styles

31

Malik, C. K. Mohammed. "Web Mining Using Improved Apriori Algorithm." International Academic Journal of Innovative Research 9, no. 1 (September 26, 2022): 52–60. http://dx.doi.org/10.9756/iajir/v9i1/iajir0917.

Full text

Abstract:

In this study, we concentrate on one of the more recent advancements in data mining, specifically web usage mining. The purpose of web usage mining is to gain usable knowledge from the data that web servers keep about the actions of their visitors. By using association rule generation in the Web domain, the pages that are most frequently referenced together can be combined into a single server session. This is possible because of the interconnected nature of the Web. In association rule mining, frequent set mining is one of the methods that may be used to discover regular patterns in a web log file. In web usage mining, the term association rules refers to groups of web pages that are accessed together and have a support value higher than a given threshold. The support can be expressed as the proportion of total transactions that match a particular pattern. With the aid of the presence or absence of association rules, web designers are able to effectively restructure the websites they have created for their clients. In this research, we introduce a method called Apriori for extracting frequent patterns from web log files. The findings of the experiments carried out on data relating to people's use of the website indicate that general sequential patterns or frequent item sets are more suitable for use in Web customization and recommender systems.
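The level-wise frequent itemset search that Apriori performs can be sketched as follows. This is the textbook algorithm, not the improved variant from the paper, and the page names and the `min_support` value are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    n = len(transactions)

    def frequent(candidates):
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        return {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}

    # Level 1: frequent single pages.
    level = frequent({frozenset([item]) for t in transactions for item in t})
    result = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = frequent(candidates)
        result.update(level)
        k += 1
    return result

# Hypothetical sessions: the set of pages requested in each visit.
sessions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C"},
]
frequent_sets = apriori(sessions, min_support=0.5)
```

With this toy data, all single pages and all pairs are frequent, while the triple `{A, B, C}` appears in only two of five sessions and is dropped.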

APA, Harvard, Vancouver, ISO, and other styles

32

Martyniuk, Hanna, Valeriy Kozlovskiy, Serhii Lazarenko, and Yuriy Balanyuk. "Data Mining Technics and Cyber Hygiene Behaviors in Social Media." South Florida Journal of Development 2, no. 2 (May 26, 2021): 2503–15. http://dx.doi.org/10.46932/sfjdv2n2-108.

Full text

Abstract:

In this work, the authors present information about social media and the use of data mining on it. They describe social networking sites, among which Facebook dominates the industry, accounting for 85% of internet users worldwide. Applying data mining techniques to large social media data sets has the potential to continue to improve search results for everyday search engines, realize specialized target marketing for businesses, help psychologists study behavior, provide new insights into social structure for sociologists, personalize web services for consumers, and even help detect and prevent spam for all of us. The most common data mining applications related to social networking sites are presented. The authors also give information about different data mining techniques and a list of these techniques. It is important to protect personal privacy when working with social network data. Recent publications highlight the need to protect privacy, as it has been shown that even anonymized data of this type can still reveal personal information when advanced data analysis techniques are used. A whole range of different threats to social networks is presented. The authors explain cyber hygiene behaviors in social networks, covering topics such as backing up data, identity theft, and online behavior.

APA, Harvard, Vancouver, ISO, and other styles

33

Kapusta, Jozef, Anna Pilková, Michal Munk, and Peter Švec. "Data pre-processing for web log mining: Case study of commercial bank website usage analysis." Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis 61, no. 4 (2013): 973–79. http://dx.doi.org/10.11118/actaun201361040973.

Full text

Abstract:

We use data cleaning, integration, reduction, and data conversion methods at the pre-processing level of data analysis. Data processing techniques improve the overall quality of the patterns mined. The paper describes the use of standard pre-processing methods for preparing data from a commercial bank website in the form of the log file obtained from the web server. Data cleaning, as the simplest step of data pre-processing, is non-trivial, as the analysed content is highly specific. We had to deal with the problem of frequent changes of the content and even frequent changes of the structure. Regular changes in the structure make use of the sitemap impossible. We present approaches for dealing with this problem. We were able to create the sitemap dynamically based only on the content of the log file. In this case study, we also examined just one part of the website rather than performing the standard analysis of an entire website, as we did not have access to all log files for security reasons. As a result, the traditional practices had to be adapted for this special case. Analysing just a small fraction of the website resulted in a short session time for regular visitors. We were not able to use recommended methods to determine the optimal value of session time. Therefore, in this paper we propose new methods based on outlier identification for raising the accuracy of the session length.
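Session identification, the pre-processing step this abstract focuses on, is often illustrated with a fixed inactivity timeout. The sketch below shows that baseline only; it does not reproduce the authors' outlier-based method, and the record layout `(visitor_id, timestamp_seconds, url)`, the 1800-second threshold, and the sample log are all assumptions made for the example.

```python
from collections import defaultdict

def sessionize(requests, timeout):
    """Split each visitor's requests into sessions whenever the gap
    between two consecutive requests exceeds `timeout` seconds."""
    by_visitor = defaultdict(list)
    for visitor, ts, url in sorted(requests, key=lambda r: (r[0], r[1])):
        by_visitor[visitor].append((ts, url))

    sessions = []
    for visitor, hits in by_visitor.items():
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > timeout:
                # Gap too long: close the current session and start a new one.
                sessions.append((visitor, current))
                current = []
            current.append(hit)
        sessions.append((visitor, current))
    return sessions

# Hypothetical cleaned log records.
log = [
    ("10.0.0.1", 0, "/"),
    ("10.0.0.1", 60, "/loans"),
    ("10.0.0.1", 4000, "/"),
    ("10.0.0.2", 30, "/cards"),
]
sessions = sessionize(log, timeout=1800)
```

The choice of `timeout` directly changes how many sessions are found, which is why the abstract treats estimating the session time threshold as a problem in its own right.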

APA, Harvard, Vancouver, ISO, and other styles

34

Rathipriya, R., K. Thangavel, and J. Bagyamani. "Extraction of Target User Group from Web Usage Data Using Evolutionary Biclustering Approach." International Journal of Applied Metaheuristic Computing 2, no. 3 (July 2011): 69–79. http://dx.doi.org/10.4018/jamc.2011070104.

Full text

Abstract:

Data mining extracts hidden information from a database that the user did not know existed. Biclustering is a data mining technique that helps marketing users to target marketing campaigns more accurately and to align campaigns more closely with the needs, wants, and attitudes of customers and prospects. The biclustering results can be tuned to find users’ browsing patterns relevant to current business problems. This paper presents a new application of biclustering to web usage data using a combination of heuristic and meta-heuristic algorithms. Two-way K-means clustering is used to generate the seeds from preprocessed web usage data, and a Greedy Heuristic is used iteratively to refine the set of seeds, which is fast but often yields locally optimal solutions. In this paper, a Genetic Algorithm is used as a global optimizer that can be coupled with the greedy method to identify the globally optimal target user groups based on their coherent browsing patterns. The performance of the proposed work is evaluated by conducting experiments on msnbc, a clickstream dataset from the UCI repository. Results show that the proposed work performs well in extracting optimal target user groups from the web usage data, which can be used for focused marketing campaigns.

APA, Harvard, Vancouver, ISO, and other styles

35

Et. al., G. Srinivas Reddy,. "DATA PROCESSING THROUGH AN ADDITIVE ROTATIONAL PERTURBATION TECHNIQUE IN A SECURED ENVIRONMENT OF PPRIVACY." INFORMATION TECHNOLOGY IN INDUSTRY 9, no. 2 (March 21, 2021): 131–35. http://dx.doi.org/10.17762/itii.v9i2.315.

Full text

Abstract:

As the usage of internet and web applications grows faster, the security and privacy of data is the most challenging issue we face, since data can easily be damaged. Various conventional techniques are used for privacy preservation, such as condensation, randomization, and tree structures. The limitations of the existing approaches are that they cannot maintain a proper balance between data utility and privacy, and they may suffer from privacy violations. This paper presents an Additive Rotation Perturbation approach for Privacy Preserving Data Mining (PPDM). In the proposed work, various datasets from the UCI Machine Learning Repository were collected and protected with a new Additive Rotational Perturbation Technique under Privacy Preserving Data Mining. Experimental results show that the proposed algorithm’s strength is high for all the datasets, as estimated using the DoV (Difference of Variance) method.

APA, Harvard, Vancouver, ISO, and other styles

36

Chandrashaker Reddy, P., and A. Suresh Babu. "Usage of co-event pattern mining with optimal fuzzy rule-based classifier for effective web page retrieval." International Journal of Engineering & Technology 7, no. 3.29 (August 24, 2018): 275. http://dx.doi.org/10.14419/ijet.v7i3.29.18811.

Full text

Abstract:

With the coming of the World Wide Web and the rise of web-based business applications and social networks, organizations on the web create a lot of information on a daily basis. Retrieving the exact information that users expect from the web is becoming a more complex and critical task. In recent times, the Web has extended its significance to the point of becoming the focal point of our modern lives. The search engine, as a tool for exploring the web, must return the desired results for any given query. Most search engines cannot fully satisfy users’ needs, and the results are often inaccurate and irrelevant. Existing techniques offer little personalization based on knowledge of ontology and history. To overcome these issues, data mining systems must be connected to the web, and one advanced and powerful concept is web-page recommendation, which is becoming more powerful nowadays. In this paper, the design of a fuzzy logic classifier algorithm is defined as a search problem in the solution space, where every node represents a rule set, a membership function, and the particular framework behaviour. The hybrid optimization algorithm is therefore applied to search for an optimal location in this solution space, which hopefully represents the near-optimal rule set and membership function. In this article, we review various techniques proposed by different researchers for web page personalization and propose a novel approach for finding optimal solutions to search for the relevant information.

APA, Harvard, Vancouver, ISO, and other styles

37

Bhandari, Adarsh. "Analyzation and Comparison of Cloud Computing and Data Mining Techniques: Big Data and Impact of Blockchain." International Journal for Research in Applied Science and Engineering Technology 9, no. 11 (November 30, 2021): 712–21. http://dx.doi.org/10.22214/ijraset.2021.38888.

Full text

Abstract:

With the rapid escalation of data driven solutions, companies are integrating huge amounts of data from multiple sources in order to gain fruitful results. To handle this tremendous volume of data, we need a cloud based architecture to store and manage it. Cloud computing has emerged as a significant infrastructure that promises to reduce the need for organizations to maintain costly computing facilities and to scale up products. Even today, heavy applications are deployed on the cloud and managed, especially at AWS, eliminating the need for error prone manual operations. This paper describes certain cloud computing tools and techniques available to handle big data, the processes involved from extracting this data through to model deployment, and the distinctions among their usage. It also shows how big data analytics and cloud computing will change the methods that will later drive the industry. Additionally, a study is presented later in the paper on the management of blockchain generated big data on the cloud and on making analytical decisions. Furthermore, the impact of blockchain on cloud computing and big data analytics is examined in this paper. Keywords: Cloud Computing, Big Data, Amazon Web Services (AWS), Google Cloud Platform (GCP), SaaS, PaaS, IaaS.

APA, Harvard, Vancouver, ISO, and other styles

38

Bianchi, Gianpiero, Renato Bruni, Cinzia Daraio, Antonio Laureti Palma, Giulio Perani, and Francesco Scalfati. "Exploring the Potentialities of Automatic Extraction of University Webometric Information." Journal of Data and Information Science 5, no. 4 (November 21, 2020): 43–55. http://dx.doi.org/10.2478/jdis-2020-0040.

Full text

Abstract:

Purpose: The main objective of this work is to show the potentialities of recently developed approaches for automatic knowledge extraction directly from the universities’ websites. The information automatically extracted can be potentially updated with a frequency higher than once per year, and be safe from manipulations or misinterpretations. Moreover, this approach allows us flexibility in collecting indicators about the efficiency of universities’ websites and their effectiveness in disseminating key contents. These new indicators can complement traditional indicators of scientific research (e.g. number of articles and number of citations) and teaching (e.g. number of students and graduates) by introducing further dimensions to allow new insights for “profiling” the analyzed universities.
Design/methodology/approach: Webometrics relies on web mining methods and techniques to perform quantitative analyses of the web. This study implements an advanced application of the webometric approach, exploiting all the three categories of web mining: web content mining; web structure mining; web usage mining. The information to compute our indicators has been extracted from the universities’ websites by using web scraping and text mining techniques. The scraped information has been stored in a NoSQL DB according to a semi-structured form to allow for retrieving information efficiently by text mining techniques. This provides increased flexibility in the design of new indicators, opening the door to new types of analyses. Some data have also been collected by means of batch interrogations of search engines (Bing, www.bing.com) or from a leading provider of Web analytics (SimilarWeb, http://www.similarweb.com). The information extracted from the Web has been combined with the University structural information taken from the European Tertiary Education Register (https://eter.joanneum.at/#/home), a database collecting information on Higher Education Institutions (HEIs) at European level. All the above was used to perform a clusterization of 79 Italian universities based on structural and digital indicators.
Findings: The main findings of this study concern the evaluation of the potential in digitalization of universities, in particular by presenting techniques for the automatic extraction of information from the web to build indicators of quality and impact of universities’ websites. These indicators can complement traditional indicators and can be used to identify groups of universities with common features using clustering techniques working with the above indicators.
Research limitations: The results reported in this study refer to Italian universities only, but the approach could be extended to other university systems abroad.
Practical implications: The approach proposed in this study and its illustration on Italian universities show the usefulness of recently introduced automatic data extraction and web scraping approaches and their practical relevance for characterizing and profiling the activities of universities on the basis of their websites. The approach could be applied to other university systems.
Originality/value: This work applies for the first time to university websites some recently introduced techniques for automatic knowledge extraction based on web scraping, optical character recognition and nontrivial text mining operations (Bruni & Bianchi, 2020).

APA, Harvard, Vancouver, ISO, and other styles

39

Alphy, Anna, and S. Prabakaran. "A Dynamic Recommender System for Improved Web Usage Mining and CRM Using Swarm Intelligence." Scientific World Journal 2015 (2015): 1–16. http://dx.doi.org/10.1155/2015/193631.

Full text

Abstract:

In modern days, to enrich e-business, websites are personalized for each user by understanding their interests and behavior. The main challenges of online usage data are information overload and its dynamic nature. To address these issues, this paper proposes WebBluegillRecom-annealing, a dynamic recommender system that uses web usage mining techniques in tandem with software agents to provide dynamic recommendations to users, which can be used for customizing a website. The proposed WebBluegillRecom-annealing dynamic recommender uses swarm intelligence from the foraging behavior of a bluegill fish. It overcomes the information overload by handling dynamic behaviors of users. Our dynamic recommender system was compared against traditional collaborative filtering systems. The results show that the proposed system has higher precision, coverage, F1 measure, and scalability than the traditional collaborative filtering systems. Moreover, the recommendations given by our system overcome the overspecialization problem by including variety in recommendations.

APA, Harvard, Vancouver, ISO, and other styles

40

Rathipriya, R., and K. Thangavel. "A Discrete Artificial Bees Colony Inspired Biclustering Algorithm." International Journal of Swarm Intelligence Research 3, no. 1 (January 2012): 30–42. http://dx.doi.org/10.4018/jsir.2012010102.

Full text

Abstract:

Biclustering methods are a promising data mining technique that has been suggested for identifying local patterns in data. Biclustering algorithms are used for mining web usage data, where they can determine a group of users that are correlated under a subset of the pages of a web site. Recently, many biclustering methods based on meta-heuristics have been proposed. Most use the Mean Squared Residue as the merit function, but interesting and relevant patterns such as shifting and scaling patterns may not be detected using this measure. However, it is important to discover this type of pattern, since web users commonly show similar behavior although their interest levels vary in different ranges or magnitudes. In this paper, a new correlation-based fitness function is designed to extract shifting and scaling browsing patterns. The proposed work uses a discrete version of the Artificial Bee Colony optimization algorithm for biclustering of web usage data to produce optimal biclusters (i.e., highly correlated biclusters). It is demonstrated on a real dataset, and the results show that the proposed approach can find significant biclusters of high quality and has better convergence performance than Binary Particle Swarm Optimization (BPSO).
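The contrast this abstract draws between the Mean Squared Residue and a correlation-based measure can be seen on a toy matrix. The sketch below uses the standard Mean Squared Residue definition and plain Pearson correlation as a stand-in; it is not the paper's fitness function, and the user-by-page values are hypothetical.

```python
from math import sqrt

def mean_squared_residue(matrix, rows, cols):
    """Average squared residue a_ij - rowmean_i - colmean_j + overall mean
    over the sub-matrix selected by `rows` and `cols`."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_r, n_c = len(rows), len(cols)
    row_means = [sum(r) / n_c for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_r)) / n_r for j in range(n_c)]
    total_mean = sum(row_means) / n_r
    return sum(
        (sub[i][j] - row_means[i] - col_means[j] + total_mean) ** 2
        for i in range(n_r) for j in range(n_c)
    ) / (n_r * n_c)

def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical hit counts: rows are users, columns are pages.
visits = [
    [1, 2, 3],   # user 0
    [4, 5, 6],   # user 1: user 0 shifted by +3
    [2, 4, 6],   # user 2: user 0 scaled by 2
]
msr_shift = mean_squared_residue(visits, rows=[0, 1], cols=[0, 1, 2])
msr_scale = mean_squared_residue(visits, rows=[0, 2], cols=[0, 1, 2])
corr_scale = pearson(visits[0], visits[2])
```

In this toy case the purely shifted pair has zero residue, while the scaled pair has a positive residue even though the two profiles are perfectly correlated, which illustrates why a residue-based merit function can miss scaling patterns.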

APA, Harvard, Vancouver, ISO, and other styles

41

Roy, Rita, and Giduturi Appa Rao. "A Framework for an Efficient Recommendation System Using Time and Fairness Constraint Based Web Usage Mining Technique." Ingénierie des systèmes d information 27, no. 3 (June 30, 2022): 425–31. http://dx.doi.org/10.18280/isi.270308.

Full text

Abstract:

Users prefer to use various websites like Facebook, Gmail, and YouTube. We can make the system predict what pages users will expect in the future and give them what they have requested. Based on the data gathered and analyzed, we can predict a user's future navigation patterns in response to the user's requests. In order to track down users’ navigational sessions, the web access logs created at a specific website are processed. The user session data is then grouped into clusters, where inter-cluster similarities are minimized while intra-cluster similarities are maximized. Recent clustering and fairness analysis research has focused on center-based methods such as k-median and k-means clustering. We propose improved constraint-based clustering (ICBC), based on fair algorithms for managing Hierarchical Agglomerative Clustering (HAC), which applies fairness constraints regardless of distance linking parameters, simplifying clustering fairness trials for HAC, and is intended for various protected groups, in comparison with vanilla HAC techniques. ICBC is also used to select an algorithm whose inherent bias matches a specific problem, and then to adjust the optimization criterion of any distinct algorithm to take the constraints on interpretation into account and improve the efficiency of clustering. We show, by evaluation on the NASA dataset, that our proposed algorithm finds fairer clusterings by balancing the constraints of the problem.

APA, Harvard, Vancouver, ISO, and other styles

42

NASRAOUI, OLFA, and RAGHU KRISHNAPURAM. "AN EVOLUTIONARY APPROACH TO MINING ROBUST MULTI-RESOLUTION WEB PROFILES AND CONTEXT SENSITIVE URL ASSOCIATIONS." International Journal of Computational Intelligence and Applications 02, no. 03 (September 2002): 339–48. http://dx.doi.org/10.1142/s1469026802000646.

Abstract:

We present a technique for simultaneously mining Web navigation patterns and maximally frequent context-sensitive itemsets (URL associations) from the historic user access data stored in Web server logs. A new hierarchical clustering technique that exploits the symbiosis between clusters in feature space and genetic biological niches in nature, called Hierarchical Unsupervised Niche Clustering (H-UNC) is presented. We use H-UNC as part of a complete system of knowledge discovery in Web usage data. Our approach does not necessitate fixing the number of clusters in advance, is insensitive to initialization, can handle noisy data, general non-differentiable similarity measures, and automatically provides profiles at multiple resolution levels. Our experiments show that our algorithm is not only capable of extracting meaningful user profiles on real Web sites, but also discovers associations between distinct URL pages on a site, with no additional cost. Unlike content based association methods, our approach discovers associations between different Web pages based only on the user access patterns and not on the page content. Also, unlike traditional context-blind association discovery methods, H-UNC discovers context-sensitive associations which are only meaningful within a limited context/user profile.

43

Ravichandran, S., and J. Sathiamoorthy. "An Innovative Method of Estimation Hewing for Invention Report Mining and Estimation Summarization." Asian Journal of Computer Science and Technology 9, no. 2 (November 5, 2020): 45–50. http://dx.doi.org/10.51983/ajcst-2020.9.2.2169.

Abstract:

With the assistance of Web 2.0 and on the basis of client interest, posting surveys on the web has become an increasingly mainstream way for individuals to share their perspectives on other clients’ suppositions and conclusions about items and services. It has become common practice for online business sites to provide facilities for individuals to exchange and publish their audits. These online audits present an abundance of data on services and products, which encourages the improvement of their business. Consequently, a growing number of recent studies have centred on Opinion Mining. Opinion Mining refers to computational methods for assessing the sentiments that are mined from different Web sources. A few Opinion Mining based techniques have been considered and analysed. From our investigation, a few opinion mining based supervised and unsupervised techniques did not deliver good outcomes because they referred to too few sentiments within the same URLs and treated features of similar significance as different. To overcome this issue, the Topic Anatomy Model TSCAN was proposed; its task, called Topic Anatomy, sums up and relates the primary parts of a topic so that readers can comprehend the substance without difficulty. By using this model, more data can be extracted and related through temporal closeness, which gives plausible content. This model plays an important part in Opinion Mining since clients can share their insights about the items. From our usage, this scheme gives the most suitable answer for the clients’ interests and demands. However, it consumes more time to predict the best performing items because of huge data sets. Consequently, our research work proposes and implements an efficient strategy for Opinion Mining called Efficient Parallel Opinion Mining (EPOM), built on the TSCAN algorithm. It covers more sites and extracts more data in a parallel way, so that an improved, efficient outcome is obtained with the least execution time. From our outcomes, it gives the most suitable answer for the clients’ interests and demands, and it improves the performance of the existing method regarding Quality of Information, Prediction and Execution Time.

44

Dhanalakshmi, B., and A. Chandrasekar. "Analyzing Student's Performance Using Efficient Opinion Mining and Ranking Method with Machine Learning Techniques." Journal of Computational and Theoretical Nanoscience 15, no. 2 (February 1, 2018): 480–84. http://dx.doi.org/10.1166/jctn.2018.7129.

Abstract:

In recent years, web technology has played a vital role in research areas. The demand for internet access is increasing considerably day by day, and everyone now needs fast internet transactions. Online shopping is the best example of an effective exchange of goods and money without the time wasted on going to a market to buy things. Thus the usage of the internet is increasing nowadays. In this paper, we propose a framework that collects details of the students from all subject teachers to decide whether a student is eligible for promotion to the next class. First, the data, which are unstructured, are collected from the teachers. The details collected from the various subject staff are then processed using feature extraction. Next, the opinion on whether to promote the student to the next class is identified by opinion mining. If the opinion given by the teacher is positive (i.e., the student passed in 3 subjects), then he/she will be promoted. If it is negative, then he/she will not be promoted to the next class. The opinions given by the teachers can be classified based on writing skill, reading skill, listening skill, etc. In the final step, ranking is performed to determine whether the student is promoted or not.

45

Trakunphutthirak, Ruangsak, Yen Cheung, and Vincent C. S. Lee. "A Study of Educational Data Mining: Evidence from a Thai University." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 734–41. http://dx.doi.org/10.1609/aaai.v33i01.3301734.

Abstract:

Educational data mining provides a way to predict student academic performance. A psychometric factor like time management is one of the major issues affecting Thai students’ academic performance. Current data sources used to predict students’ performance are limited to the manual collection of data or data from a single unit of study which cannot be generalised to indicate overall academic performance. This study uses an additional data source from a university log file to predict academic performance. It investigates the browsing categories and the Internet access activities of students with respect to their time management during their studies. A single source of data is insufficient to identify those students who are at-risk of failing in their academic studies. Furthermore, there is a paucity of recent empirical studies in this area to provide insights into the relationship between students’ academic performance and their Internet access activities. To contribute to this area of research, we employed two datasets such as web-browsing categories and Internet access activity types to select the best outcomes, and compared different weights in the time and frequency domains. We found that the random forest technique provides the best outcome in these datasets to identify those students who are at-risk of failure. We also found that data from their Internet access activities reveals more accurate outcomes than data from browsing categories alone. The combination of two datasets reveals a better picture of students’ Internet usage and thus identifies students who are academically at-risk of failure. Further work involves collecting more Internet access log file data, analysing it over a longer period and relating the period of data collection with events during the academic year.

46

Papadakis, Nikos, Haridimos Kondylakis, Anastasios Kalaentzis, Ioannis Komporakis, Ioannis A. Deligiannis, Malvina Steiakaki, George Alexiou, and Xanthoula Atsalaki. "BlogSearch: Semantic Services for Aggregating and Searching Blog Articles." International Journal of Semantic Computing 10, no. 03 (September 2016): 399–415. http://dx.doi.org/10.1142/s1793351x16500033.

Abstract:

The advent of new web technologies and the explosion of information available online have led to information overload. During this information revolution, blogs have become a mainstream medium for providing news. Although there are several arguments about their validity and credibility, the large number of blogs currently available requires advanced techniques for the collection, analysis, mining and efficient querying of the available information. To this end, we present BlogSearch, a novel platform for aggregating, indexing and searching blog articles. The information is modelled using a novel RDF/S ontology named Blogs Ontology and published as Linked Open Data. In addition, two sets of APIs are provided for inserting, updating and searching information, and the platform also provides graphical user interfaces (GUIs) for searching and inserting information. To the best of our knowledge, our platform is the only one currently available that publishes blog articles as Linked Open Data and simultaneously provides APIs and GUIs for aggregating, inserting and searching articles.

47

Kumar, Biresh, Sharmistha Roy, Anurag Sinha, Celestine Iwendi, and Ľubomíra Strážovská. "E-Commerce Website Usability Analysis Using the Association Rule Mining and Machine Learning Algorithm." Mathematics 11, no. 1 (December 21, 2022): 25. http://dx.doi.org/10.3390/math11010025.

Abstract:

The overall effectiveness of a website as an e-commerce platform is influenced by how usable it is. This study aimed to find out if advanced web metrics, derived from Google Analytics software, could be used to evaluate the overall usability of e-commerce sites and identify potential usability issues. It is simple to gather web indicators, but processing and interpretation take time. This data is produced through several digital channels, including mobile. Big data has proven to be very helpful in a variety of online platforms, including social networking and e-commerce websites. The sheer amount of data that needs to be processed and assessed to be useful is one of the main issues with e-commerce today as a result of the digital revolution. Additionally, on social media a crucial growth strategy for e-commerce is the usage of BDA capabilities as a guideline to boost sales and draw clients for suppliers. In this paper, we have used the KMP algorithm-based multivariate pruning method for web-based index searching and different web analytics algorithms with machine learning classifiers to extract patterns from transactional data gathered from e-commerce websites. Moreover, through the use of log-based transactional data, the research presented in this paper suggests a new machine learning-based method for evaluating the usability of e-commerce websites. To identify the underlying relationship between the overall usability of the eLearning system and its predictor factors, three machine learning techniques and multiple linear regressions are used to create prediction models. This strategy will lead the e-commerce industry to an economically profitable stage. This capability can assist a vendor in keeping track of customers and items they have viewed, as well as categorizing how customers use their e-commerce emporium so the vendor can cater to their specific needs. It has been proposed that machine learning models, by offering trustworthy prognoses, can aid in excellent usability. Such models might be incorporated into an online prognostic calculator or tool to help with treatment selection and possibly increase visibility. However, none of these models have been recommended for use in reusability because of concerns about the deployment of machine learning in e-commerce and technical issues. One problem with machine learning science that needs to be solved is explainability. For instance, let us say B is 10 and all the members of our population are even integers. The hash function’s behavior is not random, since h(x) can only take the values 0, 2, 4, 6, and 8. However, if B = 11, we would find that 1/11th of the even integers is sent to each of the 11 buckets. The hash function would work well in this situation.
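
The bucket example that closes this abstract can be checked with a short sketch. The abstract does not define h; the code below assumes h(x) = x mod B, and the population of 1,100 even integers and the `bucket_counts` helper are hypothetical choices made for illustration.

```python
from collections import Counter


def bucket_counts(values, num_buckets):
    """Count how many values fall into each bucket under h(x) = x mod B.

    Assumption: h is the modulo hash; the abstract does not state h explicitly.
    """
    counts = Counter(x % num_buckets for x in values)
    return [counts.get(b, 0) for b in range(num_buckets)]


# Hypothetical population: the 1,100 even integers 0, 2, ..., 2198.
evens = range(0, 2200, 2)

counts_b10 = bucket_counts(evens, 10)  # B = 10 shares the factor 2 with every key
counts_b11 = bucket_counts(evens, 11)  # B = 11 is prime and shares no factor

print(counts_b10)  # only buckets 0, 2, 4, 6, 8 are non-empty
print(counts_b11)  # every bucket receives the same number of keys
```

Under this assumption, the skew with B = 10 comes from B sharing the factor 2 with every key, which is why a bucket count with no common factor, such as 11, spreads the same keys evenly.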

48

Alonso-Secades, Vidal, Alfonso-José López-Rivero, Manuel Martín-Merino-Acera, Manuel-José Ruiz-García, and Olga Arranz-García. "Designing an Intelligent Virtual Educational System to Improve the Efficiency of Primary Education in Developing Countries." Electronics 11, no. 9 (May 6, 2022): 1487. http://dx.doi.org/10.3390/electronics11091487.

Abstract:

Incorporating technology into virtual education encourages educational institutions to demand a migration from the current learning management system towards an intelligent virtual educational system, seeking greater benefit by exploiting the data generated by students in their day-to-day activities. Therefore, the design of these intelligent systems must be performed from a new perspective, which will take advantage of the new analytical functions provided by technologies such as artificial intelligence, big data, educational data mining techniques, and web analytics. This paper focuses on primary education in developing countries, showing the design of an intelligent virtual educational system to improve the efficiency of primary education through recommendations based on reliable data. The intelligent system is formed of four subsystems: data warehousing, analytical data processing, monitoring process and recommender system for educational agents. To illustrate this, the paper contains two dashboards that analyze, respectively, the digital resources usage time and an aggregate profile of teachers’ digital skills, in order to infer new activities that improve efficiency. These intelligent virtual educational systems focus the teaching–learning process on new forms of interaction on an educational future oriented to personalized teaching for the students, and new evaluation and teaching processes for each professor.

49

Husin, Husna Sarirah, James Thom, and Xiuzhen Zhang. "Evolution of user navigation behavior for online news." International Journal of Web Information Systems 18, no. 1 (October 28, 2021): 1–22. http://dx.doi.org/10.1108/ijwis-06-2021-0064.

Abstract:

Purpose: The purpose of the study is to use web server logs to analyze the changes in user behavior in reading online news, in terms of desktop and mobile users. Advances in mobile technology and social media have paved the way for online news consumption to evolve. There is an absence of research into the changes in user behavior in terms of desktop versus mobile users, particularly by analyzing the server logs. Design/methodology/approach: In this paper, the authors investigate the evolution of user behavior using logs from the Malaysian newspaper Berita Harian Online in April 2012 and April 2017. Web usage mining techniques were used for pre-processing the logs and identifying user sessions. A Markov model is used to analyze navigation flows, and association rule mining is used to analyze user behavior within sessions. Findings: It was found that page accesses have increased tremendously, particularly from Android phones, and about half of the requests in 2017 are referred from Facebook. Navigation flow between the main page, articles and section pages has changed from 2012 to 2017; while most users started navigation with the main page in 2012, readers often started with an article in 2017. Based on association rules, National and Sports are the most frequent section pages in 2012 and 2017 for desktop and mobile. However, based on the lift and conviction, these two sections are not read together in the same session as frequently as might be expected. Other less popular items have a higher probability of being read together in a session. Research limitations/implications: The localized data set is from Berita Harian Online; although unique to this particular newspaper, the findings and the methodology for investigating user behavior can be applied to other online news. On another note, the data set could be extended to cover more than a month. Although data for the year 2012 was initially collected, unfortunately only the data for April 2012 is complete. Other months have missing days. Therefore, to make an impartial comparison of the evolution of user behavior over five years, the Web server logs for April 2017 were used. Originality/value: The user behavior in 2012 and 2017 was compared using association rules and Markov flow. Different from existing studies analyzing online newspaper Web server logs, this paper uniquely investigates changes in user behavior as a result of mobile phones becoming a mainstream technology for accessing the Web.
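
The lift and conviction measures named in this abstract can be illustrated with a minimal sketch. The formulas are the standard definitions, lift = confidence / support(consequent) and conviction = (1 - support(consequent)) / (1 - confidence); the `rule_metrics` helper and the sessions are hypothetical and are not data from the study.

```python
def rule_metrics(sessions, antecedent, consequent):
    """Compute confidence, lift and conviction for the rule antecedent -> consequent.

    `sessions` is a list of sets of items (here: section pages seen in a session).
    Assumes both items occur in at least one session.
    """
    n = len(sessions)
    count_a = sum(1 for s in sessions if antecedent in s)
    count_b = sum(1 for s in sessions if consequent in s)
    count_ab = sum(1 for s in sessions if antecedent in s and consequent in s)

    support_b = count_b / n
    confidence = count_ab / count_a
    lift = confidence / support_b
    # Conviction is unbounded when the rule never fails (confidence == 1).
    if confidence == 1:
        conviction = float("inf")
    else:
        conviction = (1 - support_b) / (1 - confidence)
    return {"confidence": confidence, "lift": lift, "conviction": conviction}


# Hypothetical sessions: two popular sections and two less popular ones.
sessions = [
    {"National", "Sports"},
    {"National"},
    {"National"},
    {"Sports"},
    {"Sports"},
    {"National", "World"},
    {"Sports", "Entertainment"},
    {"World", "Entertainment"},
]

popular = rule_metrics(sessions, "National", "Sports")
niche = rule_metrics(sessions, "World", "Entertainment")
print(popular)
print(niche)
```

In this toy data the two most frequent sections have a lift below 1, meaning they co-occur less often than independence would predict, while the less frequent pair has a lift above 1. This mirrors the kind of pattern the abstract reports, without reproducing its numbers.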

50

Bar-Hen, A., N. Paragios, and A. Flahault. "Public Health and Epidemiology Informatics." Yearbook of Medical Informatics 25, no. 01 (August 2016): 240–46. http://dx.doi.org/10.15265/iy-2016-021.

Abstract:

Summary. Objectives: The aim of this manuscript is to provide a brief overview of the scientific challenges that should be addressed in order to unlock the full potential of using data from a general point of view, as well as to present some ideas that could help answer specific needs for data understanding in the field of health sciences and epidemiology. Methods: A survey of uses and challenges of big data analyses for medicine and public health was conducted. The first part of the paper focuses on big data techniques, algorithms, and statistical approaches to identify patterns in data. The second part describes some cutting-edge applications of analyses and predictive modeling in public health. Results: In recent years, we witnessed a revolution regarding the nature, collection, and availability of data in general. This was especially striking in the health sector and particularly in the field of epidemiology. Data derive from a large variety of sources, e.g. clinical settings, billing claims, care scheduling, drug usage, web-based search queries, and Tweets. Conclusion: The exploitation of the information (data mining, artificial intelligence) relevant to these data has become one of the most promising as well as challenging tasks from societal and scientific viewpoints in order to leverage the information available and make public health more efficient.
