Dissertations on the topic "Web usage data mining techniques"
Format your source according to APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 dissertations for your research on the topic "Web usage data mining techniques."
Next to every work in the list of references there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a .pdf file and read its abstract online, whenever these are available in the metadata.
Browse dissertations from a wide variety of disciplines and organize your bibliography correctly.
Khalil, Faten. "Combining web data mining techniques for web page access prediction." University of Southern Queensland, Faculty of Sciences, 2008. http://eprints.usq.edu.au/archive/00004341/.
Nagi, Mohamad. "Integrating Network Analysis and Data Mining Techniques into Effective Framework for Web Mining and Recommendation. A Framework for Web Mining and Recommendation." Thesis, University of Bradford, 2015. http://hdl.handle.net/10454/14200.
Khasawneh, Natheer Yousef. "Toward Better Website Usage: Leveraging Data Mining Techniques and Rough Set Learning to Construct Better-to-use Websites." Akron, OH : University of Akron, 2005. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=akron1120534472.
Повний текст джерела"August, 2005." Title from electronic dissertation title page (viewed 01/14/2006) Advisor, John Durkin; Committee members, John Welch, James Grover, Yueh-Jaw Lin, Yingcai Xiao, Chien-Chung Chan; Department Chair, Alex Jose De Abreu-Garcia; Dean of the College, George Haritos; Dean of the Graduate School, George R. Newkome. Includes bibliographical references.
Ammari, Ahmad N. "Transforming user data into user value by novel mining techniques for extraction of web content, structure and usage patterns : the development and evaluation of new Web mining methods that enhance information retrieval and improve the understanding of users' Web behavior in websites and social blogs." Thesis, University of Bradford, 2010. http://hdl.handle.net/10454/5269.
Norguet, Jean-Pierre. "Semantic analysis in web usage mining." Doctoral thesis, Universite Libre de Bruxelles, 2006. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/210890.
Повний текст джерелаIndeed, according to organizations theory, the higher levels in the organizations need summarized and conceptual information to take fast, high-level, and effective decisions. For Web sites, these levels include the organization managers and the Web site chief editors. At these levels, the results produced by Web analytics tools are mostly useless. Indeed, most of these results target Web designers and Web developers. Summary reports like the number of visitors and the number of page views can be of some interest to the organization manager but these results are poor. Finally, page-group and directory hits give the Web site chief editor conceptual results, but these are limited by several problems like page synonymy (several pages contain the same topic), page polysemy (a page contains several topics), page temporality, and page volatility.
Web usage mining research projects on their part have mostly left aside Web analytics and its limitations and have focused on other research paths. Examples of these paths are usage pattern analysis, personalization, system improvement, site structure modification, marketing business intelligence, and usage characterization. A potential contribution to Web analytics can be found in research about reverse clustering analysis, a technique based on self-organizing feature maps. This technique integrates Web usage mining and Web content mining in order to rank the Web site pages according to an original popularity score. However, the algorithm is not scalable and does not answer the page-polysemy, page-synonymy, page-temporality, and page-volatility problems. As a consequence, these approaches fail at delivering summarized and conceptual results.
An interesting attempt to obtain such results has been the Information Scent algorithm, which produces a list of term vectors representing the visitors' needs. These vectors provide a semantic representation of the visitors' needs and can be easily interpreted. Unfortunately, the results suffer from term polysemy and term synonymy, are visit-centric rather than site-centric, and are not scalable to produce. Finally, according to a recent survey, no Web usage mining research project has proposed a satisfying solution to provide site-wide summarized and conceptual audience metrics.
In this dissertation, we present our solution to the need for summarized and conceptual audience metrics in Web analytics. We first describe several methods for mining the Web pages output by Web servers. These methods include content journaling, script parsing, server monitoring, network monitoring, and client-side mining. These techniques can be used alone or in combination to mine the Web pages output by any Web site. The occurrences of taxonomy terms in these pages can then be aggregated to provide concept-based audience metrics. To evaluate the results, we implement a prototype and run a number of test cases with real Web sites.
According to the first experiments with our prototype and SQL Server OLAP Analysis Service, concept-based metrics prove to be far more summarized and intuitive than page-based metrics. As a consequence, concept-based metrics can be exploited at higher levels in the organization. For example, organization managers can redefine the organization strategy according to the visitors' interests. Concept-based metrics also give an intuitive view of the messages delivered through the Web site and make it possible to align the Web site communication with the organization objectives. The Web site chief editor, for their part, can interpret the metrics to redefine the publishing orders and the sub-editors' writing tasks. As decisions at higher levels in the organization should be more effective, concept-based metrics should contribute significantly to Web usage mining and Web analytics.
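The aggregation step Norguet describes, counting taxonomy-term occurrences in mined pages and rolling them up into concept-level metrics, can be sketched as follows. The dictionary structures and the view-count weighting are illustrative assumptions, not the dissertation's actual implementation:

```python
from collections import Counter

def concept_metrics(page_texts, page_views, taxonomy):
    """Aggregate taxonomy-term occurrences into concept-level audience metrics,
    weighting each page's term counts by its number of views (illustrative)."""
    scores = Counter()
    for page, text in page_texts.items():
        words = text.lower().split()
        views = page_views.get(page, 0)
        for concept, terms in taxonomy.items():
            hits = sum(words.count(term) for term in terms)
            scores[concept] += hits * views  # weight occurrences by audience size
    return dict(scores)
```

A manager-level report then ranks concepts instead of individual URLs, which is exactly the summarization the abstract argues for.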
Doctorate in applied sciences
Sobolewska, Katarzyna-Ewa. "Web links utility assessment using data mining techniques." Thesis, Blekinge Tekniska Högskola, Avdelningen för programvarusystem, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-2936.
Bayir, Murat Ali. "A New Reactive Method For Processing Web Usage Data." Master's thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/12607323/index.pdf.
In this thesis, a new reactive session reconstruction method called 'Smart-SRA' is introduced. Web usage mining is a type of web mining which exploits data mining techniques to discover valuable information from the navigations of Web users. As in classical data mining, data processing and pattern discovery are the main issues in web usage mining. The first phase of web usage mining is the data processing phase, which includes session reconstruction. Session reconstruction is the most important task of web usage mining since it directly and significantly affects the quality of the frequent patterns extracted in the final step. Session reconstruction methods can be classified into two categories, 'reactive' and 'proactive', with respect to the data source and the data processing time. If the user requests are processed after the server handles them, the technique is called 'reactive', while in 'proactive' strategies this processing occurs during the user's interactive browsing of the web site. Smart-SRA is a reactive session reconstruction technique which uses web log data and the site topology. In order to compare Smart-SRA with previous reactive methods, a web agent simulator has been developed. Our agent simulator models the behavior of web users and generates web user navigations as well as the log data kept by the web server. In this way, the actual user sessions are known and the success of different techniques can be compared. In this thesis, it is shown that the sessions generated by Smart-SRA are more accurate than the sessions constructed by previous heuristics.
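The session reconstruction idea can be illustrated with a toy reactive heuristic that combines an idle-time threshold with a site-topology check. This is a simplified sketch of the general approach, not the actual Smart-SRA algorithm:

```python
def reconstruct_sessions(requests, links, gap=1800):
    """Split one user's request stream into sessions (simplified reactive sketch).

    requests: [(timestamp, page)] sorted by time; links: {page: set of successors}.
    A request starts a new session when the idle gap is exceeded or the page is
    not reachable from any page already in the current session (topology check).
    """
    sessions = []  # each element: (timestamp of last request, list of pages)
    for ts, page in requests:
        if sessions:
            last_ts, pages = sessions[-1]
            reachable = any(page in links.get(p, set()) for p in pages)
            if ts - last_ts <= gap and reachable:
                pages.append(page)
                sessions[-1] = (ts, pages)
                continue
        sessions.append((ts, [page]))
    return [pages for _, pages in sessions]
```

Real reconstruction heuristics also consult referrer fields and handle caching effects; the sketch only shows why topology information helps separate interleaved sessions.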
Wu, Hao-cun, and 吳浩存. "A multidimensional data model for monitoring web usage and optimizing website topology." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2004. http://hub.hku.hk/bib/B29528215.
Özakar, Belgin Püskülcü Halis. "Finding And Evaluating Patterns In Wes Repository Using Database Technology And Data Mining Algorithms/." [s.l.]: [s.n.], 2002. http://library.iyte.edu.tr/tezler/master/bilgisayaryazilimi/T000130.pdf.
Karlsson, Sophie. "Datainsamling med Web Usage Mining : Lagringsstrategier för loggning av serverdata." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-9467.
The complexity of web applications and the number of advanced services keep increasing. Logging can increase the understanding of users' behavior and needs, but it is often used excessively, without yielding relevant information. More advanced systems bring increased performance requirements, and logging becomes even more demanding for them. There is a need for smarter systems, for development of performance-improvement techniques, and for better data collection techniques. This work investigates how response times are affected when logging server data, as in the data collection phase of web usage mining, depending on the storage strategy used. The hypothesis is that logging may degrade response times even further. An experiment was conducted in which four different storage strategies, with different table and database structures, were used to store server data, to see which strategy affects response times the least. ANOVA shows a statistically significant difference between the storage strategies. Storage strategy 4 has the best effect on average response time, while storage strategy 2 has the most negative effect. Future work would be valuable for strengthening these results.
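The experimental setup, timing the same stream of log inserts under different storage strategies, can be sketched with SQLite. The two schemas below are hypothetical stand-ins, not the four strategies from the thesis:

```python
import sqlite3
import time

def time_strategy(setup_sql, insert_fn, n=2000):
    """Return mean per-request logging time for one storage strategy (illustrative)."""
    con = sqlite3.connect(":memory:")
    con.executescript(setup_sql)
    t0 = time.perf_counter()
    for i in range(n):
        insert_fn(con, i)
    con.commit()
    return (time.perf_counter() - t0) / n

# Strategy A: one denormalized log table (hypothetical schema).
wide = "CREATE TABLE log(ts REAL, url TEXT, agent TEXT);"
def insert_wide(con, i):
    con.execute("INSERT INTO log VALUES (?, ?, ?)", (time.time(), f"/p/{i}", "UA"))

# Strategy B: normalized tables, each URL stored once (hypothetical schema).
norm = ("CREATE TABLE urls(id INTEGER PRIMARY KEY, url TEXT UNIQUE);"
        "CREATE TABLE hits(ts REAL, url_id INTEGER);")
def insert_norm(con, i):
    url = f"/p/{i % 50}"
    con.execute("INSERT OR IGNORE INTO urls(url) VALUES (?)", (url,))
    (url_id,) = con.execute("SELECT id FROM urls WHERE url=?", (url,)).fetchone()
    con.execute("INSERT INTO hits VALUES (?, ?)", (time.time(), url_id))
```

Running `time_strategy(wide, insert_wide)` against `time_strategy(norm, insert_norm)` over many repetitions yields the per-strategy samples that an ANOVA, as in the thesis, would then compare.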
Shun, Yeuk Kiu. "Web mining from client side user activity log /." View Abstract or Full-Text, 2002. http://library.ust.hk/cgi/db/thesis.pl?COMP%202002%20SHUN.
Includes bibliographical references (leaves 85-90). Also available in electronic version. Access restricted to campus users.
Wang, Hui. "Mining novel Web user behavior models for access prediction /." View Abstract or Full-Text, 2003. http://library.ust.hk/cgi/db/thesis.pl?COMP%202003%20WANG.
Includes bibliographical references (leaves 83-91). Also available in electronic version. Access restricted to campus users.
Zhao, Hongkun. "Automatic wrapper generation for the extraction of search result records from search engines." Diss., Online access via UMI:, 2007.
Agarwal, Khushbu. "A partition based approach to approximate tree mining a memory hierarchy perspective /." Columbus, Ohio : Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1196284256.
Färholt, Fredric. "Less Detectable Web Scraping Techniques." Thesis, Linnéuniversitetet, Institutionen för datavetenskap och medieteknik (DM), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-104887.
Web scraping is an efficient way to collect data, and it has become an activity that is easy to carry out with a high chance of success. Users no longer need to be technology enthusiasts to scrape data; today there are plenty of easy-to-use platform services. This study runs experiments to see how one can scrape undetected using a popular and intelligent JavaScript library (Puppeteer). Three web-scraping algorithms, two of which use movement patterns from real web users, demonstrate how information can be collected. The algorithms were run against a website set up for the experiment with perceptible security, a honeypot, and activity logging, which made it possible to collect and evaluate data from both the algorithms and the website. The results show that it may be possible to scrape undetected using Puppeteer. One of the algorithms also reveals the possibility of controlling performance by using Puppeteer's built-in methods.
Vollino, Bruno Winiemko. "Descoberta de perfis de uso de web services." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2013. http://hdl.handle.net/10183/83669.
During the life cycle of a web service, several changes are made to its interface, which are possibly incompatible with current usage and may break client applications. Providers must make decisions about changes to their services, most often without insight into the effect these changes will have on their customers. Existing research and tools fail to provide providers with proper knowledge about the actual usage of the service interface's features by distinct types of customers, making it impossible to assess the actual impact of changes. This work presents a framework for the discovery of web service usage profiles, which constitute a descriptive model of the usage patterns found in distinct groups of clients with respect to the service interface features they use. The framework supports a user in the process of knowledge discovery over service usage data through semi-automatic and configurable tasks, which assist the preparation and analysis of usage data with the minimum user intervention possible. The framework monitors web service interactions, loads pre-processed usage data into a unified database, and supports the generation of usage profiles. Data mining techniques are used to group clients according to their usage patterns of features, and these groups are used to build service usage profiles. The entire process is configured via parameters, which allow the user to determine the level of detail of the usage information included in the profiles and the criteria for evaluating the similarity between client applications. The proposal is validated through experiments with synthetic data, simulated according to features expected in the use of a real service. The experimental results demonstrate that the proposed framework allows the discovery of useful service usage profiles and provides evidence about the proper parameterization of the framework.
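The grouping step, clustering client applications by how often they use each interface feature, can be sketched with a tiny k-means over usage vectors. This is a generic stand-in for the framework's mining step, not its actual code:

```python
import random
from math import dist

def kmeans(points, k, iters=20, seed=0):
    """Tiny k-means: group clients represented as feature-usage vectors.

    points: list of equal-length tuples (e.g. calls per operation per client).
    Returns k lists of points. Illustrative only; real frameworks would use a
    mature implementation with better initialization and convergence checks.
    """
    rnd = random.Random(seed)
    centers = rnd.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        # recompute each center as the mean of its cluster (keep old if empty)
        centers = [tuple(sum(d) / len(cl) for d in zip(*cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return clusters
```

Each resulting cluster would then be summarized (for example, by its centroid) into one usage profile describing a distinct type of client.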
Pabarškaitė, Židrina. "Enhancements of pre-processing, analysis and presentation techniques in web log mining." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2009. http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2009~D_20090713_142203-05841.
As the Internet penetrates our lives, increasing attention is paid to the quality of information delivery and to how information is presented. The research area of this dissertation is the mining of data accumulated by web servers and ways of improving how this data is presented to the end user. The required knowledge is extracted from web server log records, which register information about the web pages sent to users. The research object is web log mining, together with related topics: improving the web data preparation steps, web text analysis, and data analysis algorithms for prediction and classification tasks. The main goal of the dissertation is to understand the behavior of website users by studying web logs and to improve the methodologies of the preparation, analysis, and result-interpretation stages. The research revealed new possibilities for web data analysis. It was found that insufficient attention had been paid to cleaning web log data. It is shown that reducing the number of insignificant records makes the data analysis process more efficient. A new method was therefore developed whose application makes the presented knowledge correspond to the users' actual navigation routes. The study also found that users' browsing histories have different lengths, so after specific data preparation, forming fixed-length vectors, it is appropriate to apply decision tree algorithms not previously used for this in practice... [see the full text]
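The cleaning step the dissertation emphasizes, dropping log records that carry no navigational meaning before analysis, can be sketched as a simple filter. The extension list and bot markers below are illustrative, not the thesis's actual rules:

```python
import re

IRRELEVANT = re.compile(r"\.(gif|jpg|jpeg|png|css|js|ico)(\?|$)", re.I)
BOT_AGENTS = ("bot", "crawler", "spider")  # illustrative marker substrings

def clean_log(entries):
    """Keep only requests that reflect deliberate user navigation.

    entries: dicts with 'url', 'status', 'agent' keys (assumed log schema).
    """
    kept = []
    for e in entries:
        if IRRELEVANT.search(e["url"]):
            continue  # embedded resources fetched automatically by the browser
        if e["status"] >= 400:
            continue  # failed requests do not reflect user intent
        if any(b in e["agent"].lower() for b in BOT_AGENTS):
            continue  # search-engine robots
        kept.append(e)
    return kept
```

Shrinking the log this way is what makes the later pattern-discovery phase both faster and closer to the users' real routes.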
Villar, Escobar Osvaldo Pablo. "Minería y Personalización de un Sitio Web para Celulares." Tesis, Universidad de Chile, 2007. http://www.repositorio.uchile.cl/handle/2250/104823.
Kliegr, Tomáš. "Clickstream Analysis." Master's thesis, Vysoká škola ekonomická v Praze, 2007. http://www.nusl.cz/ntk/nusl-2065.
Nenadić, Oleg. "An implementation of correspondence analysis in R and its application in the analysis of web usage /." Göttingen : Cuvillier, 2007. http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&doc_number=016229974&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA.
Persson, Pontus. "Identifying Early Usage Patterns That Increase User Retention Rates In A Mobile Web Browser." Thesis, Linköpings universitet, Databas och informationsteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-137793.
Gomes, João Fernando dos Anjos. "Recomendação de navegação em portais da internet como um serviço suportado em ferramentas Web Analytics." Master's thesis, Instituto Politécnico de Setúbal. Escola Superior de Ciências Empresariais, 2016. http://hdl.handle.net/10400.26/17292.
As the Internet usage keeps increasing, the number of web sites and hence the number of web pages also keeps increasing, so there is a need to align the user experience with the overall websites purposes. Toward this requirement, the proposed recommendation systems suggest the user pages that might be of its interest based on past navigation profiles of overall site usage. Most of existing recommendation systems are based on association rules or based on keywords (when content is considered). However, on usage data shortage or sparse data and if sequential order is to be considered such traditional approaches may become unsuitable. Conversely, the Web Analytics arena, assuming other paradigm, has experienced a considerable growth through mature tools that allow the collection and analysis of internet data in order to understand and optimize website efficiency and efficacy. This work proposes the development of a recommendation system based on the Google Analytics tool. The prototype is constituted by two main components which are: 1) a service responsible for the construction and associated logic that underlies recommendations generation; 2) an embeddable library on any website that will furnish website with a configurable recommendation widget. Preliminary evaluations had showed that the implementation follows the logic of the proposed model.
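A minimal version of the underlying idea, recommending pages from past navigation profiles, can be sketched with first-order transition counts. This ignores the Google Analytics integration and the widget component, and is only a conceptual stand-in:

```python
from collections import Counter, defaultdict

def build_model(sessions):
    """Count page-to-page transitions observed in past navigation sessions."""
    model = defaultdict(Counter)
    for s in sessions:
        for cur, nxt in zip(s, s[1:]):
            model[cur][nxt] += 1
    return model

def recommend(model, page, k=2):
    """Suggest the k pages most often visited right after `page`."""
    return [p for p, _ in model[page].most_common(k)]
```

A recommendation widget would call `recommend` with the visitor's current page; sequence-aware approaches, as the abstract notes, become preferable when data is sparse or order matters beyond single transitions.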
Kilic, Sefa. "Clustering Frequent Navigation Patterns From Website Logs Using Ontology And Temporal Information." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12613979/index.pdf.
Vlk, Vladimír. "Získávání znalostí z webových logů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2013. http://www.nusl.cz/ntk/nusl-236196.
Mair, Patrick, and Marcus Hudec. "Session Clustering Using Mixtures of Proportional Hazards Models." Department of Statistics and Mathematics, WU Vienna University of Economics and Business, 2008. http://epub.wu.ac.at/598/1/document.pdf.
Series: Research Report Series / Department of Statistics and Mathematics
Suleiman, Iyad. "Integrating data mining and social network techniques into the development of a Web-based adaptive play-based assessment tool for school readiness." Thesis, University of Bradford, 2013. http://hdl.handle.net/10454/7293.
Chen, Xiaowei. "Measurement, analysis and improvement of BitTorrent Darknets." HKBU Institutional Repository, 2013. http://repository.hkbu.edu.hk/etd_ra/1545.
Calderón-Benavides, Liliana. "Unsupervised Identification of the User’s Query Intent in Web Search." Doctoral thesis, Universitat Pompeu Fabra, 2011. http://hdl.handle.net/10803/51299.
This doctoral work focuses on identifying and understanding the intentions that motivate users to search the Web, by applying machine learning methods that require no data beyond the users' own information needs as expressed in their queries. Knowing and interpreting this invaluable information can help Web search systems find particularly relevant resources and thus improve user satisfaction. Using unsupervised learning techniques, selected according to the context of each problem and whose results have proved effective for each of the problems addressed, this work shows not only that users' intents can be identified, but that this can be done automatically. The research developed in this thesis has been an evolutionary process. It starts from the analysis of the manual classification of several sets of queries that real users submitted to a search engine; it moves through the proposal of a new classification of user query intents and the use of different unsupervised learning techniques to identify those intents; and it arrives at establishing that this is not a one-dimensional problem, but should be considered a multi-dimensional one, where each dimension, or facet, helps to clarify and establish the user's intent. Building on this last work, we have created a model to identify user intent in an online scenario.
Song, Ge. "Méthodes parallèles pour le traitement des flux de données continus." Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLC059/document.
We live in a world where a vast amount of data is continuously generated, and it arrives in a variety of ways: every time we search on Google, purchase something on Amazon, click 'like' on Facebook, upload an image on Instagram, or a sensor is activated, new data is produced. Data is more than simple numerical information; it now comes in a variety of forms. Isolated data is of little value, but when this huge amount of data is connected, it becomes very valuable for finding new insights. At the same time, data is time-sensitive: the most accurate and effective way to describe it is as a data stream, and if the latest data is not promptly processed, the opportunity to obtain the most useful results is missed. A parallel and distributed system for processing large amounts of data streams in real time therefore has important research value and good application prospects. This thesis focuses on the study of parallel and continuous data stream joins. We divide this problem into two categories: the first is Data Driven Parallel and Continuous Join, and the second is Query Driven Parallel and Continuous Join.
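A continuous equi-join over two streams can be sketched, single-threaded, with bounded time windows. The parallel and distributed aspects studied in the thesis are out of scope for this illustration:

```python
from collections import deque

class WindowJoin:
    """Symmetric join over two bounded time windows (single-threaded sketch).

    Tuples older than `window` are expired; each arriving tuple is matched by
    key against the opposite window. Real systems partition this state across
    machines, which is the problem the thesis addresses.
    """
    def __init__(self, window):
        self.window = window
        self.left, self.right = deque(), deque()

    def _expire(self, buf, now):
        while buf and now - buf[0][0] > self.window:
            buf.popleft()  # drop tuples that fell out of the time window

    def insert(self, side, ts, key, value):
        own, other = (self.left, self.right) if side == "L" else (self.right, self.left)
        self._expire(own, ts)
        self._expire(other, ts)
        own.append((ts, key, value))
        return [(value, v) for (t, k, v) in other if k == key]
```

Emitting results on every insert is what makes the join continuous: output is produced as the streams flow, never by scanning a finished dataset.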
Van, der Westhuizen Frederick Jacques. "Lifetime value modelling / Frederick Jacques van der Westhuizen." Thesis, North-West University, 2009. http://hdl.handle.net/10394/2521.
Thesis (M.Sc. (Computer Science))--North-West University, Vaal Triangle Campus, 2009.
Castellanos-Paez, Sandra. "Apprentissage de routines pour la prise de décision séquentielle." Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAM043.
Intuitively, a system capable of exploiting its past experiences should be able to achieve better performance. One way to build on past experiences is to learn macros (i.e. routines). They can then be used to improve the performance of the solving process of new problems. In automated planning, the challenge remains on developing powerful planning techniques capable of effectively exploring the search space that grows exponentially. Learning macros from previously acquired knowledge has proven to be beneficial for improving a planner's performance. This thesis contributes mainly to the field of automated planning, and it is more specifically related to learning macros for classical planning. We focused on developing a domain-independent learning framework that identifies sequences of actions (even non-adjacent) from past solution plans and selects the most useful routines (i.e. macros), based on a priori evaluation, to enhance the planning domain. First, we studied the possibility of using sequential pattern mining for extracting frequent sequences of actions from past solution plans, and the link between the frequency of a macro and its utility. We found out that the frequency alone may not provide a consistent selection of useful macro-actions (i.e. sequences of actions with constant objects). Second, we discussed the problem of learning macro-operators (i.e. sequences of actions with variable objects) by using classic pattern mining algorithms in planning. Despite the efforts, we find ourselves in a dead-end with the selection process because the pattern mining filtering structures are not adapted to planning. Finally, we provided a novel approach called METEOR, which ensures to find the frequent sequences of operators from a set of plans without a loss of information about their characteristics.
This framework was conceived for mining macro-operators from past solution plans and for selecting the optimal set of macro-operators that maximises the node gain. It has proven able to mine macro-operators of different lengths for four different benchmark domains and, thanks to the selection phase, to deliver a positive impact on the search time without drastically decreasing the quality of the plans.
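The mining step, finding frequent, possibly non-adjacent, ordered pairs of actions across solution plans, can be sketched as follows. Real sequential pattern mining handles longer sequences, and METEOR itself is considerably more involved:

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(plans, min_support):
    """Toy stand-in for sequential pattern mining over solution plans.

    Counts ordered action pairs (gaps allowed, order preserved) and keeps those
    occurring in at least `min_support` plans.
    """
    counts = Counter()
    for plan in plans:
        # combinations() preserves plan order and allows non-adjacent pairs;
        # the set ensures each pair is counted once per plan (plan-level support)
        counts.update(set(combinations(plan, 2)))
    return {pair: c for pair, c in counts.items() if c >= min_support}
```

A surviving pair such as `("load", "unload")` would be the seed of a candidate macro; the thesis's point is that frequency alone is not enough, so a utility-based selection phase must follow.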
Malherbe, Emmanuel. "Standardization of textual data for comprehensive job market analysis." Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLC058/document.
With so many job adverts and candidate profiles available online, e-recruitment constitutes a rich object of study. All this information is however textual data, which from a computational point of view is unstructured. The large number and heterogeneity of recruitment websites also means that there is a lot of vocabularies and nomenclatures. One of the difficulties when dealing with this type of raw textual data is being able to grasp the concepts contained in it, which is the problem of standardization that is tackled in this thesis. The aim of standardization is to create a unified process providing values in a nomenclature. A nomenclature is by definition a finite set of meaningful concepts, which means that the attributes resulting from standardization are a structured representation of the information. Several questions are however raised: Are the websites' structured data usable for a unified standardization? What structure of nomenclature is the best suited for standardization, and how to leverage it? Is it possible to automatically build such a nomenclature from scratch, or to manage the standardization process without one? To illustrate the various obstacles of standardization, the examples we are going to study include the inference of the skills or the category of a job advert, or the level of training of a candidate profile. One of the challenges of e-recruitment is that the concepts are continuously evolving, which means that the standardization must be up-to-date with job market trends. In light of this, we will propose a set of machine learning models that require minimal supervision and can easily adapt to the evolution of the nomenclatures. The questions raised found partial answers using Case Based Reasoning, semi-supervised Learning-to-Rank, latent variable models, and leveraging the evolving sources of the semantic web and social media.
The different models proposed have been tested on real-world data before being implemented in an industrial environment. The resulting standardization is at the core of SmartSearch, a project which provides a comprehensive analysis of the job market.
Klinczak, Marjori Naiele Mocelin. "Identificação e propagação de temas em redes sociais." Universidade Tecnológica Federal do Paraná, 2016. http://repositorio.utfpr.edu.br/jspui/handle/1/2304.
Recent years have been marked by the emergence of various social media, from Orkut to Facebook, Twitter, Youtube, Google+ and many others, each offering new features as a way to attract more users. These social media generate a large amount of data which, if processed properly, can be used to identify trends, patterns and changes. The objective of this work is the discovery of the key topics in a social network, characterized as groupings of relevant terms restricted to a particular context, and the study of their evolution over time. For that, procedures based on data mining and text processing are used. First, text preprocessing techniques are applied in order to identify the most relevant terms that appear in the text messages from the social network. Next, classical clustering algorithms (k-means, k-medoids, DBSCAN) and the more recent NMF (Non-negative Matrix Factorization) are used to identify the main themes of these messages, characterized as groupings of relevant terms. The proposal was evaluated on the Twitter network, using tweet datasets from different contexts. The results show the feasibility of the proposal and its applicability to the identification of relevant topics in this social network.
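A crude stand-in for the theme-discovery step (the thesis uses k-means, k-medoids, DBSCAN and NMF) can be sketched with frequent-term labeling. The stop list and the assignment rule are illustrative only:

```python
from collections import Counter, defaultdict

STOP = {"the", "a", "of", "and", "to", "in", "is"}  # illustrative stop list

def dominant_topics(messages, n_terms=2):
    """Pick the most frequent content words as topic labels, then assign each
    message to the first label it mentions. A toy substitute for clustering."""
    words = Counter(w for m in messages for w in m.lower().split() if w not in STOP)
    labels = [w for w, _ in words.most_common(n_terms)]
    groups = defaultdict(list)
    for m in messages:
        for lab in labels:
            if lab in m.lower().split():
                groups[lab].append(m)
                break
    return dict(groups)
```

Proper clustering or NMF would instead group messages by overlapping term distributions, so that a theme is a weighted set of terms rather than a single keyword.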
Nguyen, Hoang Viet Tuan. "Prise en compte de la qualité des données lors de l’extraction et de la sélection d’évolutions dans les séries temporelles de champs de déplacements en imagerie satellitaire." Thesis, Université Grenoble Alpes (ComUE), 2018. http://www.theses.fr/2018GREAA011.
This PhD thesis deals with knowledge discovery from Displacement Field Time Series (DFTS) obtained by satellite imagery. Such series now occupy a central place in the study and monitoring of natural phenomena such as earthquakes, volcanic eruptions and glacier displacements. These series are rich in both spatial and temporal information and can now be produced regularly at a lower cost thanks to spatial programs such as the European Copernicus program and its famous Sentinel satellites. Our proposals are based on the extraction of grouped frequent sequential patterns. These patterns, originally defined for the extraction of knowledge from Satellite Image Time Series (SITS), have shown their potential in early work on analyzing a DFTS. Nevertheless, they cannot use the confidence indices that come along with DFTS, and the swap method used to select the most promising patterns does not take into account their spatiotemporal complementarities, each pattern being evaluated individually. Our contribution is thus twofold. A first proposal aims to associate a measure of reliability with each pattern by using the confidence indices. This measure allows the selection of patterns whose occurrences in the data are on average sufficiently reliable. We propose a corresponding constraint-based extraction algorithm. It relies on an efficient search for the most reliable occurrences by dynamic programming and on a pruning of the search space provided by a partial push strategy. This new method has been implemented on the basis of the existing prototype SITS-P2miner, developed by the LISTIC and LIRIS laboratories to extract and rank grouped frequent sequential patterns. A second contribution concerns the selection of the most promising patterns. Based on an informational criterion, it makes it possible to take into account both the confidence indices and the way the patterns complement each other spatially and temporally.
For this aim, the confidence indices are interpreted as probabilities, and the DFTS are seen as probabilistic databases whose distributions are only partial. The informational gain associated with a pattern is then defined according to the ability of its occurrences to complete or refine the distributions characterizing the data. On this basis, a heuristic is proposed to select informative and complementary patterns. This method provides a set of weakly redundant patterns that is therefore easier to interpret than those provided by swap randomization. It has been implemented in a dedicated prototype. Both proposals are evaluated quantitatively and qualitatively using a reference DFTS covering Greenland glaciers, constructed from Landsat optical data. Another DFTS that we built from TerraSAR-X radar data covering the Mont-Blanc massif is also used. In addition to being constructed from different data and remote sensing techniques, these series differ drastically in terms of confidence indices, the series covering the Mont-Blanc massif having very low levels of confidence. In both cases, the proposed methods operate under standard conditions of resource consumption (time, space), and experts' knowledge of the studied areas is confirmed and completed.
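The reliability measure of the first proposal can be caricatured as follows: a pattern is kept only if its occurrences are on average sufficiently reliable according to the confidence indices. The data, the threshold and the function names are invented for the demo; the real algorithm searches for the most reliable occurrences by dynamic programming inside a constraint-based extraction.

```python
# Hedged sketch: each pattern occurrence spans measurements that carry
# confidence indices; a pattern is kept if its occurrences are on average
# sufficiently reliable. Toy data and threshold, not from the thesis.

def occurrence_reliability(confidences):
    """Reliability of one occurrence = mean confidence of its measurements."""
    return sum(confidences) / len(confidences)

def pattern_reliability(occurrences):
    """Average reliability over all occurrences of the pattern."""
    return sum(occurrence_reliability(o) for o in occurrences) / len(occurrences)

def select_reliable(patterns, threshold=0.6):
    return [name for name, occs in patterns.items()
            if pattern_reliability(occs) >= threshold]

patterns = {
    "speedup-then-slowdown": [[0.9, 0.8], [0.7, 0.6]],   # mean 0.75
    "noisy-drift":           [[0.3, 0.2], [0.5, 0.4]],   # mean 0.35
}
print(select_reliable(patterns))  # -> ['speedup-then-slowdown']
```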
Aleksandrova, Marharyta. "Factorisation de matrices et analyse de contraste pour la recommandation." Thesis, Université de Lorraine, 2017. http://www.theses.fr/2017LORR0080/document.
In many application areas, data elements can be high-dimensional, which raises the problem of dimensionality reduction. Dimensionality reduction techniques can be classified by their aim (dimensionality reduction for optimal data representation versus dimensionality reduction for classification) as well as by the adopted strategy (feature selection versus feature extraction). The set of features resulting from feature extraction methods is usually uninterpretable. The first scientific problem of the thesis is therefore: how can interpretable latent features be extracted? Dimensionality reduction for classification aims to enhance the classification power of the selected subset of features. We view the task of classification as the task of trigger factor identification, that is, identification of those factors that can influence the transfer of data elements from one class to another. The second scientific problem of this thesis is: how can these trigger factors be identified automatically? We address both problems within the recommender systems application domain. We propose to interpret the latent features of matrix factorization-based recommender systems as real users. We design an algorithm for automatic identification of trigger factors based on the concepts of contrast analysis. Through experimental results, we show that the defined patterns can indeed be considered as trigger factors.
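The "latent features as real users" idea can be sketched minimally: after a factorization R ≈ WH, pick for each latent feature the real user with the highest loading as its interpretable representative. The toy loading matrix below is assumed, not the output of an actual factorization, and the selection rule is a simplification of the thesis's approach.

```python
# Sketch: map each latent feature of a factorized rating matrix to the real
# user whose loading on that feature is highest, making the feature readable.

W = {                         # user -> loadings on 2 latent features (toy)
    "alice": [0.9, 0.1],
    "bob":   [0.2, 0.8],
    "carol": [0.4, 0.5],
}

def representative_users(loadings, n_features):
    reps = []
    for f in range(n_features):
        # the user with the strongest affinity to latent feature f
        reps.append(max(loadings, key=lambda u: loadings[u][f]))
    return reps

print(representative_users(W, 2))  # -> ['alice', 'bob']
```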
Braik, William. "Détection d'évènements complexes dans les flux d'évènements massifs." Thesis, Bordeaux, 2017. http://www.theses.fr/2017BORD0596/document.
Pattern detection over streams of events is gaining more and more attention, especially in the field of eCommerce. Our industrial partner Cdiscount, one of the largest eCommerce companies in France, aims to use pattern detection for real-time customer behavior analysis. The main challenges to consider are efficiency and scalability, as the detection of customer behaviors must be achieved within a few seconds, while millions of unique customers visit the website every day, producing a large event stream. In this thesis, we present Auros, a system for large-scale and efficient pattern detection for eCommerce. It relies on a domain-specific language to define behavior patterns. Patterns are then compiled into deterministic finite automata, which are run on a Big Data streaming platform. Our evaluation shows that our approach is efficient and scalable, and fits the requirements of Cdiscount.
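A toy version of this pipeline, a behavior pattern compiled into a small automaton that is advanced once per incoming event, might look as follows. The event names and the "ordered events with anything in between" semantics are assumptions for the demo; the real system defines patterns in a domain-specific language and runs the automata on a streaming platform.

```python
# Minimal sketch: compile an ordered event pattern into a stateful matcher
# and feed it one event at a time, as a stream processor would.

def compile_pattern(events):
    """Return a matcher: feed(event) -> True when the pattern completes."""
    state = {"pos": 0}
    def feed(event):
        if event == events[state["pos"]]:
            state["pos"] += 1
            if state["pos"] == len(events):
                state["pos"] = 0          # reset to detect further matches
                return True
        return False
    return feed

detect_purchase = compile_pattern(["view_product", "add_to_cart", "checkout"])
stream = ["view_product", "search", "add_to_cart", "view_product", "checkout"]
hits = [e for e in stream if detect_purchase(e)]
print(hits)  # -> ['checkout']: the pattern completes at the checkout event
```

Real deterministic finite automata would precompute a transition table per state and symbol; the closure above keeps only the current position for brevity.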
Ng, Kwun-Keung. "An empirical study of web usage mining techniques." Thesis, 2002. http://spectrum.library.concordia.ca/1805/1/MQ72942.pdf.
"Web mining techniques for query log analysis and expertise retrieval." Thesis, 2009. http://library.cuhk.edu.hk/record=b6075418.
Thesis (Ph.D.)--Chinese University of Hong Kong, 2009.
Includes bibliographical references (leaves 156-175).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstract also in Chinese.
Xu, Guandong. "Web mining techniques for recommendation and personalization." Thesis, 2008. https://vuir.vu.edu.au/1422/.
Xu, Guandong. "Web mining techniques for recommendation and personalization." 2008. http://eprints.vu.edu.au/1422/1/xu.pdf.
Cavalcanti, Fábio Torres. "Incremental mining techniques." Master's thesis, 2005. http://hdl.handle.net/1822/3965.
The increasing need for organizational data exploration and analysis, seeking new knowledge that may be implicit in operational systems, has given a huge impulse to the study of data mining techniques. This impulse is clearly noticeable in the e-commerce domain, where the analysis of a client's past behaviour is extremely valuable and may bring up important working instruments for determining his future behaviour. It is therefore possible to predict what a Web site visitor might be looking for, and thus restructure the Web site to meet his needs. The visitor then keeps navigating in the Web site longer, which increases the probability of his being attracted by some product, leading to its purchase. To achieve this goal, Web site adaptation has to be fast enough to change while the visitor navigates, and it also has to ensure that the adaptation follows the most recent visitors' navigation behaviour patterns, which requires a mining algorithm with a response time good enough to update the patterns frequently. Typical databases change continuously over time, which can invalidate some patterns or introduce new ones. Conventional data mining techniques have thus proved inefficient, as they need to be re-executed to update the mining results after the latest database changes. Incremental mining techniques emerged to avoid algorithm re-execution and to update mining results when incremental data are added or old data are removed, ensuring a better performance of the data mining processes. In this work, we analyze some existing incremental mining strategies and models, with particular emphasis on their application to Web sites, in order to develop models that discover Web user behaviour patterns and automatically generate recommendations to restructure sites in useful time.
To accomplish this task, we designed and implemented Spottrigger, a system responsible for the whole data life cycle in a Web site restructuring task. This life cycle includes tasks specially oriented to extracting the raw data stored in Web servers, passing these data through intermediate phases of cleansing and preparation, performing an incremental data mining technique to extract users' navigation patterns, and finally suggesting new locations for spots on the Web site according to the patterns found and the profile of the visitor. We applied Spottrigger to our case study, which was based on data gathered from a real online newspaper. Our main goal was to collect, in useful time, information about the users consulting the site at a given moment and thus restructure the Web site in the short term, delivering the scheduled advertisements activated according to the user's profile. Basically, our idea is to have advertisements classified in levels and to restructure the Web site so that higher-level advertisements appear on the pages the visitor will most probably access. To do that, we construct a page ranking for the visitor based on the results obtained through the incremental mining technique. Since visitors' navigation behaviour may change over time, the incremental mining algorithm is responsible for catching these behaviour changes and quickly updating the patterns. Using Spottrigger as a decision support system for advertising, a newspaper company may significantly improve the merchandising of its publicity spots, guaranteeing that a given advertisement reaches a higher number of visitors even if they change their behaviour and visit pages that were usually not visited.
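The incremental principle underlying such a mining step can be shown in miniature: keep the support counts from the previous run and update them with only the newly arrived sessions, instead of re-scanning the whole log. The sessions are toy data and only item pairs are counted; the thesis studies full incremental algorithms.

```python
# Sketch of incremental support counting: old counts are reused and only the
# increment (new sessions) is scanned.

from itertools import combinations
from collections import Counter

def count_pairs(sessions):
    c = Counter()
    for s in sessions:
        for pair in combinations(sorted(set(s)), 2):
            c[pair] += 1
    return c

def incremental_update(old_counts, new_sessions):
    updated = old_counts.copy()
    updated.update(count_pairs(new_sessions))   # touch only the increment
    return updated

old = count_pairs([["home", "sports"], ["home", "news"]])
new = incremental_update(old, [["home", "sports", "news"]])
print(new[("home", "sports")])  # -> 2
```

The same idea extends to deletions (subtracting counts of expired sessions), which is how incremental miners keep patterns aligned with recent behaviour.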
LIU, HUI-YU, and 劉慧瑜. "Application of Data Mining Techniques to a Web-Based Virtual Store." Thesis, 2000. http://ndltd.ncl.edu.tw/handle/10108545208292454970.
Huang, Hsing-Feng, and 黃星峯. "Using Data Mining Techniques to Build Adaptive E-Learning Web Site." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/61829979286470308750.
大同大學
資訊經營學系(所)
92
The majority of e-learning Web sites have predefined course frameworks. No matter who enters the Web site, almost the same link types are offered, and course materials accumulate as time goes by. As a result, learners can get lost in the intricate links between teaching materials. Conklin indicated that 'disorientation' and 'cognitive overhead' are the two prime issues in hypermedia documents [1]. Thus, this thesis focuses on the development of an adaptive model that administers a pre-learning test before enrollment in the course materials, records browsing behavior on the course materials, and administers a post-learning exam at the end of the course. These data are assembled into a data warehouse. We use the data mining techniques of classification and association to analyze the collected data and build a group navigation model and a personal navigation model. Finally, we use these models to predict a learner's personal navigation pattern and give the learner adaptive guidance.
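A group navigation model of the kind this abstract describes can be hedged into a few lines: from past learners' page sequences, estimate which page most often follows the current one and recommend it as guidance. The sessions and page names below are invented, and the real thesis combines this with classification and per-learner personalization.

```python
# Toy group navigation model: count page transitions across learners and
# recommend the most frequent follow-up page.

from collections import Counter, defaultdict

SESSIONS = [
    ["intro", "loops", "quiz"],
    ["intro", "loops", "arrays"],
    ["intro", "loops", "quiz"],
]

def build_model(sessions):
    nxt = defaultdict(Counter)
    for s in sessions:
        for a, b in zip(s, s[1:]):
            nxt[a][b] += 1
    return nxt

def guide(model, page):
    """Recommend the most frequent follow-up page."""
    return model[page].most_common(1)[0][0]

model = build_model(SESSIONS)
print(guide(model, "loops"))  # -> quiz (2 of 3 learners went there next)
```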
Huang, Hsing-Feng, and 黃星峰. "Using Data Mining Techniques to Build Adaptive E-Learning Web Site." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/54807931352614411674.
大同大學
資訊經營研究所
92
The majority of e-learning Web sites have predefined course frameworks. No matter who enters the Web site, almost the same link types are offered, and course materials accumulate as time goes by. As a result, learners can get lost in the intricate links between teaching materials. Conklin indicated that 'disorientation' and 'cognitive overhead' are the two prime issues in hypermedia documents [1]. Thus, this thesis focuses on the development of an adaptive model that administers a pre-learning test before enrollment in the course materials, records browsing behavior on the course materials, and administers a post-learning exam at the end of the course. These data are assembled into a data warehouse. We use the data mining techniques of classification and association to analyze the collected data and build a group navigation model and a personal navigation model. Finally, we use these models to predict a learner's personal navigation pattern and give the learner adaptive guidance.
Hsiao, Ming-Chuan, and 蕭明傳. "Applying Data Mining Techniques in a Web-Based Programming Languages Learning Environment." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/49026280027196878324.
雲林科技大學
資訊管理系碩士班
96
Web-based learning systems accumulate a vast amount of information which is very valuable for analyzing students' behavior; these systems record their interactions with students and the students' study status. We use association rule mining to analyze a web-based programming learning environment and identify patterns between the concepts of the course as reflected in students' exercise behavior. Programming learning focuses on implementation practice, so we classify students' behavior on each exercise into four statuses (fast, slow, finished after class, and failed to finish) and find the associations among all questions in different chapters. We then use a clustering method to separate the questions into clusters according to two attributes: in-class finish rate and average compile count. Finally, we inspect the association rules in each cluster to find patterns of interest, for example: in the high finish-rate cluster, which questions with bad status hinder students' study of another chapter? In the low finish-rate cluster, which questions with good status help students understand other questions? Because much web-based teaching material is designed by tutors, we aim to make the associations hidden between concepts explicit. Analyzing students' behavior in this way helps us understand their learning status and helps tutors improve their tuition.
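The association step can be sketched under assumed toy data: each record lists a student's exercise outcomes, and candidate rules such as "slow on Q2 implies failed on Q5" are scored by support and confidence, as in classic association rule mining. The records and thresholds are invented for illustration.

```python
# Toy support/confidence scoring over exercise-outcome records.

RECORDS = [
    {"Q2:slow", "Q5:failed"},
    {"Q2:slow", "Q5:failed", "Q1:fast"},
    {"Q2:slow", "Q5:finish_after_class"},
    {"Q1:fast", "Q5:failed"},
]

def support(itemset, records):
    """Fraction of records containing every item of the itemset."""
    return sum(itemset <= r for r in records) / len(records)

def confidence(lhs, rhs, records):
    """Of the records matching lhs, the fraction that also match rhs."""
    return support(lhs | rhs, records) / support(lhs, records)

print(support({"Q2:slow"}, RECORDS))                     # -> 0.75
print(confidence({"Q2:slow"}, {"Q5:failed"}, RECORDS))   # 2/3 of them failed Q5
```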
黃釗田. "The Study in TVE Course Querying Web Site Managed by Data Mining Techniques." Thesis, 2000. http://ndltd.ncl.edu.tw/handle/46004054249962008600.
國立臺灣師範大學
工業教育研究所
88
Data Mining focuses on mining large amounts of information and analyzing it with the help of technology, so as to find useful data that is otherwise unknown and hidden. Data Mining is therefore grounded in the database field, and before it is carried out, items such as design and type have to be managed well first. This thesis uses the database management methods of Data Mining to set up a study framework for the management of a TVE course querying web site, through information analysis and the web site system. Furthermore, Data Mining is applied to TVE course information, and the final analysis results can serve as a reference for the management and construction of the TVE course querying web site. By now, almost all TVE schools have set up their own web sites, but integration among the schools' course databases is weak. People often need to spend much time analyzing and designing the course web site system, and the results cannot fully meet users' requests when searching for course information. This study takes as its starting point meeting users' requests for querying TVE courses and constructing an educational data warehouse in the future. It matches users to the notion of shopping customers and uses the database management ideas of Data Mining to analyze the "TVE course querying web site", building analysis models that match users to customers, TVE courses to goods, and schools (teachers) to goods names. Finally, it uses prediction and analysis methods to estimate the number of users of the TVE course querying web site, so as to provide references for system designers in managing and building the site, so that its management can meet every user's request.
Hsieh, Yu-Chun, and 謝宇俊. "Using Data Mining Techniques in Analyzing Patient’s Properties for the High Usage of Medical Resources." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/52668644629159599434.
國立台北護理學院
資訊管理研究所
96
Taiwan's National Health Insurance system was founded in 1995. Nowadays this system faces problems that have also occurred in Germany and Canada, for example a shortage of funding and the abuse of medical resources; the abuse of medical resources is the most serious of them. According to a recent report by the Bureau of National Health Insurance, Taiwanese patients visit a hospital about 15 times per year, whereas in other countries the frequency is 4 to 7 times per year. This situation clearly reveals the problem of wasted medical resources. This study tries to build proper models for analyzing patients' hospital-visiting behavior and to find the corresponding characteristics of patients, making it possible to understand the reasons for the abuse of medical resources. To verify the proposed data mining models, this study used a sampling database provided by the NHIRD (National Health Insurance Research Database) as the experimental data set. Using data warehousing, clustering, neural network and association rule techniques, we propose a conceptual data model that can analyze the visiting behavior of patients and find the profiles of each kind of visiting behavior. The major ideas of the proposed method are threefold. First, we used a data warehouse to build a summarized dataset from the NHIRD based on three dimensions: year, disease, and outpatient/inpatient. Next, we used a self-organizing map (SOM) to classify patients into affinity clusters; the key variables used to classify patients are the "number of hospital visits flag", the "season III behavior variable" and "the amount of money spent". Finally, we used association rules to filter out the profiles of each patient group with high usage of medical resources.
We hope the research results are useful for understanding frequent hospital-visiting behavior and for preventing the abuse of medical resources.
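The SOM step can be sketched with a bare-bones two-unit map over invented patient vectors (annual visits, spend). The deterministic initialization and the data are assumptions for the demo; it mirrors only the idea of clustering patients into affinity groups before mining each group's profile.

```python
# Minimal 1-D self-organizing map with two units over toy patient vectors.

def train_som(data, epochs=50, lr=0.5):
    # two units, deterministically initialized at the first and last record
    units = [list(data[0]), list(data[-1])]
    for _ in range(epochs):
        for x in data:
            best = min(range(len(units)),
                       key=lambda i: sum((u - v) ** 2 for u, v in zip(units[i], x)))
            # move the winning unit toward the sample
            units[best] = [u + lr * (v - u) for u, v in zip(units[best], x)]
    return units

def assign(units, x):
    return min(range(len(units)),
               key=lambda i: sum((u - v) ** 2 for u, v in zip(units[i], x)))

patients = [[40, 9.0], [38, 8.5], [5, 1.0], [6, 1.2]]   # [visits/year, spend]
som = train_som(patients)
labels = [assign(som, p) for p in patients]
print(labels)  # -> [0, 0, 1, 1]: heavy users vs light users
```

A real SOM would use a neighborhood function and a decaying learning rate; association rules would then be mined within each resulting group, as the abstract describes.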
"Improving opinion mining with feature-opinion association and human computation." 2009. http://library.cuhk.edu.hk/record=b5894009.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2009.
Includes bibliographical references (leaves [101]-113).
Abstracts in English and Chinese.
Abstract
Acknowledgement
1 Introduction
1.1 Major Topic
1.1.1 Opinion Mining
1.1.2 Human Computation
1.2 Major Work and Contributions
1.3 Thesis Outline
2 Literature Review
2.1 Opinion Mining
2.1.1 Feature Extraction
2.1.2 Sentiment Analysis
2.2 Social Computing
2.2.1 Social Bookmarking
2.2.2 Social Games
3 Feature-Opinion Association for Sentiment Analysis
3.1 Motivation
3.2 Problem Definition
3.2.1 Definitions
3.3 A Closer Look at the Problem
3.3.1 Discussion
3.4 Proposed Approach
3.4.1 Nearest Opinion Word (DIST)
3.4.2 Co-Occurrence Frequency (COF)
3.4.3 Co-Occurrence Ratio (COR)
3.4.4 Likelihood-Ratio Test (LHR)
3.4.5 Combined Method
3.4.6 Feature-Opinion Association Algorithm
3.4.7 Sentiment Lexicon Expansion
3.5 Evaluation
3.5.1 Corpus Data Set
3.5.2 Test Data Set
3.5.3 Feature-Opinion Association Accuracy
3.6 Summary
4 Social Game for Opinion Mining
4.1 Motivation
4.2 Social Game Model
4.2.1 Definitions
4.2.2 Social Game Problem
4.2.3 Social Game Flow
4.2.4 Answer Extraction Procedure
4.3 Social Game Properties
4.3.1 Type of Information
4.3.2 Game Structure
4.3.3 Verification Method
4.3.4 Game Mechanism
4.3.5 Player Requirement
4.4 Design Guideline
4.5 Opinion Mining Game Design
4.5.1 OpinionMatch
4.5.2 FeatureGuess
4.6 Summary
5 Tag Sentiment Analysis for Social Bookmark Recommendation System
5.1 Motivation
5.2 Problem Statement
5.2.1 Social Bookmarking Model
5.2.2 Social Bookmark Recommendation (SBR) Problem
5.3 Proposed Approach
5.3.1 Social Bookmark Recommendation Framework
5.3.2 Subjective Tag Detection (STD)
5.3.3 Similarity Matrices
5.3.4 User-Website Matrix
5.3.5 User-Tag Matrix
5.3.6 Website-Tag Matrix
5.4 Pearson Correlation Coefficient
5.5 Social Network-based User Similarity
5.6 User-oriented Website Ranking
5.7 Evaluation
5.7.1 Bookmark Data
5.7.2 Social Network
5.7.3 Subjective Tag List
5.7.4 Subjective Tag Detection
5.7.5 Bookmark Recommendation Quality
5.7.6 System Evaluation
5.8 Summary
6 Conclusion and Future Work
A List of Symbols and Notations
B List of Publications
Bibliography
吳保珠. "Using Data Mining Techniques to Discover the Most Adaptive Web Paths on Learning Website." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/42952731174483963154.
南台科技大學
資訊管理系
92
As Internet technology matures and network infrastructure is built rapidly, the World Wide Web has accumulated a huge amount of information and driven the transformation to an information-based society. Nowadays, online learning Web sites are popular and widespread, and more and more students browse or use search facilities to learn new knowledge. With the explosive widespread use of the World Wide Web, information on Web sites has been growing at an amazing speed, making it increasingly difficult to search. The World Wide Web does not guarantee an efficient learning environment, and searching for information on Web sites becomes inefficient. There is thus a strong incentive for Web planners to analyze how to let users access information efficiently. At present, most e-learning Web sites log the elapsed browsing time and the visited pages, but unfortunately no further analysis of user behavior is provided. This thesis uses data mining technology to analyze this information and to excavate the most adaptive learning paths for students. We use two mining methods to discover the most adaptive web paths of students on a learning website. First, we efficiently mine association rules from the students' web data and find the most adaptive web paths according to these rules. Moreover, we develop an efficient sequential pattern algorithm to mine maximal large sequences from the students' web sequences, and find the most adaptive web paths according to these maximal large sequences.
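The "maximal large sequences" idea can be illustrated on toy sessions: mine frequent contiguous page paths and keep only the maximal ones, i.e. frequent paths not contained in a longer frequent path. The sessions, support threshold and contiguity assumption are simplifications; the thesis's sequential pattern algorithm is more general and far more efficient.

```python
# Toy maximal frequent-path mining over clickstream sessions.

from collections import Counter

SESSIONS = [
    ["home", "course", "quiz"],
    ["home", "course", "quiz"],
    ["home", "course", "forum"],
]

def frequent_paths(sessions, min_support=2):
    counts = Counter()
    for s in sessions:
        for i in range(len(s)):
            for j in range(i + 1, len(s) + 1):
                counts[tuple(s[i:j])] += 1
    return {p for p, c in counts.items() if c >= min_support}

def is_subpath(short, long_):
    return any(long_[i:i + len(short)] == short
               for i in range(len(long_) - len(short) + 1))

def maximal(paths):
    return {p for p in paths
            if not any(p != q and is_subpath(p, q) for q in paths)}

freq = frequent_paths(SESSIONS)
print(sorted(maximal(freq)))  # -> [('home', 'course', 'quiz')]
```

The surviving maximal path is exactly the kind of "most adaptive web path" the abstract proposes to surface to learners.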