A selection of scholarly literature on the topic "Web document clustering (WDC)"
Journal articles on the topic "Web document clustering (WDC)"
Im, Yeong-Hui. "A Post Web Document Clustering Algorithm." KIPS Transactions:PartB 9B, no. 1 (February 1, 2002): 7–16. http://dx.doi.org/10.3745/kipstb.2002.9b.1.007.
He, Xiaofeng, Hongyuan Zha, Chris H.Q. Ding, and Horst D. Simon. "Web document clustering using hyperlink structures." Computational Statistics & Data Analysis 41, no. 1 (November 2002): 19–45. http://dx.doi.org/10.1016/s0167-9473(02)00070-1.
Hammouda, K. M., and M. S. Kamel. "Efficient phrase-based document indexing for Web document clustering." IEEE Transactions on Knowledge and Data Engineering 16, no. 10 (October 2004): 1279–96. http://dx.doi.org/10.1109/tkde.2004.58.
Manukonda, Sumathi Rani, and Nomula Divya. "Efficient Document Clustering for Web Search Result." International Journal of Engineering & Technology 7, no. 3.3 (June 21, 2018): 90. http://dx.doi.org/10.14419/ijet.v7i3.3.14494.
Creţulescu, Radu G., Daniel I. Morariu, Macarie Breazu, and Daniel Volovici. "DBSCAN Algorithm for Document Clustering." International Journal of Advanced Statistics and IT&C for Economics and Life Sciences 9, no. 1 (June 1, 2019): 58–66. http://dx.doi.org/10.2478/ijasitels-2019-0007.
Huang, Shen, Zheng Chen, Yong Yu, and Wei-Ying Ma. "Multitype features coselection for Web document clustering." IEEE Transactions on Knowledge and Data Engineering 18, no. 4 (April 2006): 448–59. http://dx.doi.org/10.1109/tkde.2006.1599384.
Chan, Samuel W. K., and Mickey W. C. Chong. "Unsupervised clustering for nontextual web document classification." Decision Support Systems 37, no. 3 (June 2004): 377–96. http://dx.doi.org/10.1016/s0167-9236(03)00035-6.
Boley, Daniel, Maria Gini, Robert Gross, Eui-Hong (Sam) Han, Kyle Hastings, George Karypis, Vipin Kumar, Bamshad Mobasher, and Jerome Moore. "Partitioning-based clustering for Web document categorization." Decision Support Systems 27, no. 3 (December 1999): 329–41. http://dx.doi.org/10.1016/s0167-9236(99)00055-x.
Su, Zhong, Qiang Yang, Hongjiang Zhang, Xiaowei Xu, Yu-Hen Hu, and Shaoping Ma. "Correlation-Based Web Document Clustering for Adaptive Web Interface Design." Knowledge and Information Systems 4, no. 2 (April 2002): 151–67. http://dx.doi.org/10.1007/s101150200002.
Chawla, Suruchi. "Application of Convolution Neural Networks in Web Search Log Mining for Effective Web Document Clustering." International Journal of Information Retrieval Research 12, no. 1 (January 2022): 1–14. http://dx.doi.org/10.4018/ijirr.300367.
Dissertations on the topic "Web document clustering (WDC)"
Coquet, Jean. "Étude exhaustive de voies de signalisation de grande taille par clustering des trajectoires et caractérisation par analyse sémantique." Thesis, Rennes 1, 2017. http://www.theses.fr/2017REN1S073/document.
Signaling pathways describe a cell's responses to external stimuli. They are indispensable in biological processes such as differentiation, proliferation, and apoptosis. Systems biology attempts to study signaling pathways exhaustively using static or dynamic models. In large models, the number of solutions that explain a biological phenomenon (for example, a cell's reaction to a stimulus) can be very high. First, this thesis proposes several strategies for grouping the solutions that describe stimulus signaling, using clustering methods and Formal Concept Analysis. It then presents a characterization of the clusters using Semantic Web methods. These strategies were applied to the TGF-beta signaling network, an extracellular stimulus that plays an important role in cancer growth, and helped to identify five large groups of trajectories characterized by different biological processes. Next, this thesis addresses the problem of translating heterogeneous data from different databases into a single formalism, with the goal of generalizing the previous study. It proposes a strategy for merging the signaling pathways of a database into a single model and then computing every signaling trajectory of the stimulus.
Roussinov, Dmitri G., and Hsinchun Chen. "Document clustering for electronic meetings: an experimental comparison of two techniques." Elsevier, 1999. http://hdl.handle.net/10150/105091.
In this article, we report our implementation and comparison of two text clustering techniques. One is based on Ward's clustering and the other on Kohonen's Self-organizing Maps. We have evaluated how closely clusters produced by a computer resemble those created by human experts. We have also measured the time that it takes for an expert to "clean up" the automatically produced clusters. The technique based on Ward's clustering was found to be more precise. Both techniques have worked equally well in detecting associations between text documents. We used text messages obtained from group brainstorming meetings.
Kellou-Menouer, Kenza. "Découverte de schéma pour les données du Web sémantique." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLV047/document.
An increasing number of linked data sources are published on the Web. However, their schema may be incomplete or missing. In addition, data do not necessarily follow their schema. This flexibility in describing the data eases their evolution, but makes their exploitation more complex. In our work, we have proposed an automatic and incremental approach enabling schema discovery from the implicit structure of the data. To complement the description of the types in a schema, we have also proposed an approach for finding the possible versions (patterns) for each of them. It proceeds online without having to download or browse the source. This can be expensive or even impossible because the sources may have some access limitations, either on the query execution time or on the number of queries. We have also addressed the problem of annotating the types in a schema, which consists in finding a set of labels capturing their meaning. We have proposed annotation algorithms which provide meaningful labels using external knowledge bases. Our approach can be used to find meaningful type labels during schema discovery, and also to enrich the description of existing types. Finally, we have proposed an approach to evaluate the gap between a data source and its schema. To this end, we have proposed a set of quality factors and the associated metrics, as well as a schema extension that reflects the heterogeneity among instances of the same type. Both factors and schema extension are used to analyze and improve the conformity between a schema and the instances it describes.
Zanghi, Hugo. "Approches modèles pour la structuration du web vu comme un graphe." Thesis, Evry-Val d'Essonne, 2010. http://www.theses.fr/2010EVRY0041/document.
The statistical analysis of complex networks is a challenging task, given that appropriate statistical models and efficient computational procedures are required in order for structures to be learned. The principle of these models is to assume that the distribution of the edge values follows a parametric distribution, conditionally on a latent structure which is used to detect connectivity patterns. However, these methods suffer from relatively slow estimation procedures, since dependencies are complex. In this thesis we adapt online estimation strategies, originally developed for the EM algorithm, to the case of graph models. In addition to the network data used in the methods mentioned above, vertex content will sometimes be available. We then propose algorithms for clustering data sets that can be modeled with a graph structure embedding vertex features. Finally, an online Web application, based on the Exalead search engine, demonstrates certain aspects of this thesis.
Qumsiyeh, Rani Majed. "Easy to Find: Creating Query-Based Multi-Document Summaries to Enhance Web Search." BYU ScholarsArchive, 2011. https://scholarsarchive.byu.edu/etd/2713.
Saoud, Zohra. "Approche robuste pour l’évaluation de la confiance des ressources sur le Web." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSE1331/document.
This thesis in computer science belongs to the field of trust management, and more specifically to recommendation systems. These systems are usually based on users’ experiences (qualitative or quantitative) of interacting with Web resources (e.g., movies, videos, and Web services). Recommender systems are undermined by three types of uncertainty, which arise because users’ ratings and identities can be questioned and because the performance of Web resources varies at run-time. We propose a robust approach for trust assessment under these uncertainties. The first type of uncertainty concerns users’ ratings. It stems from the vulnerability of the system to malicious users who provide false ratings. To tackle this uncertainty, we propose a fuzzy model of users’ credibility. This model uses a fuzzy clustering technique to distinguish between malicious users and strict users, who are usually excluded in existing approaches. The second type of uncertainty concerns users’ identities. A malicious user may purposely create virtual identities to provide false ratings. To tackle this type of attack, known as a Sybil attack, we propose a rating-filtering model based on users’ credibility and the trust graph to which they belong. We propose two mechanisms: one for assigning capacities to users, and one for selecting the users whose ratings will be retained when evaluating trust. The first mechanism reduces the attack capacity of Sybil users. The second mechanism chooses paths in the trust graph that include trusted users with maximum capacities. Both mechanisms use users’ credibility as a heuristic. To deal with the uncertainty over the capacity of a Web resource to satisfy users’ requests, we propose two approaches for assessing the trust of Web resources, one deterministic and one probabilistic. The first consolidates users’ ratings, taking into account users’ credibility values.
The second relies on probability theory coupled with possible-worlds semantics. Probabilistic databases offer a better representation of the uncertainty underlying users’ credibility and also permit an uncertain assessment of resource trust. Finally, we develop the system WRTrust (Web Resource Trust), which implements our trust assessment approach. We carried out several experiments to evaluate the performance and robustness of our system. The results show that trust quality has been significantly improved, as has the system’s robustness in the presence of false-rating attacks and Sybil attacks.
Ghenname, Mérième. "Le web social et le web sémantique pour la recommandation de ressources pédagogiques." Thesis, Saint-Etienne, 2015. http://www.theses.fr/2015STET4015/document.
This work has been jointly supervised by U. Jean Monnet Saint Etienne, in the Hubert Curien Lab (Frederique Laforest, Christophe Gravier, Julien Subercaze) and U. Mohamed V Rabat, LeRMA ENSIAS (Rachida Ahjoun, Mounia Abik). Knowledge, education and learning are major concerns in today’s society. Technologies for human learning aim to promote, stimulate, support and validate the learning process. Our approach explores the opportunities raised by combining Social Web and Semantic Web technologies for e-learning. More precisely, we work on discovering learners’ profiles from their activities on the social web. The Social Web can be a source of information, as it involves users in the information world and gives them the ability to participate in the construction and dissemination of knowledge. We focused on tracking the different types of contributions, activities and conversations in learners’ spontaneous collaborative activities on social networks. The learner profile is based not only on the knowledge extracted from his/her activities on the e-learning system, but also on his/her many activities on social networks. We propose a methodology for exploiting the hashtags contained in users’ writings to automatically generate learners’ semantic profiles. Hashtags require some processing before they can serve as a source of knowledge about the user’s interests. We have defined a method to identify the semantics of hashtags and the semantic relationships between the meanings of different hashtags. We have also defined the concept of a Folksionary, a hashtag dictionary that, for each hashtag, clusters its definitions into meanings. Semantized hashtags are then used to feed the learner’s profile so as to personalize recommendations of learning material. The goal is to build a semantic representation of the activities and interests of learners on social networks in order to enrich their profiles.
We also discuss our recommendation approach, which is based on three types of filtering (personalized, social, and statistical interactions with the system). We focus on personalized recommendation of pedagogical resources to the learner according to his/her expectations and profile.
Luu, Vinh Trung. "Using event sequence alignment to automatically segment web users for prediction and recommendation." Thesis, Mulhouse, 2016. http://www.theses.fr/2016MULH0098/document.
This thesis explored the application of sequence alignment in web usage mining, including user clustering and web prediction and recommendation. This topic was chosen because online business has developed rapidly and gathered a huge volume of information, while the use of sequence alignment in the field is still limited. In this context, researchers are required to build models that rely on sequence alignment methods and to empirically assess their relevance in user behavioral mining. This thesis presents a novel methodological point of view in the area and shows applicable approaches in our quest to improve on previous related work. Web usage behavior analysis has been central to a large number of investigations aimed at maintaining the relationship between users and web services. Web content providers have addressed useful information extraction in order to understand users’ needs, so that their content can be adapted accordingly. One of the promising approaches to reach this target is pattern discovery using clustering, which groups users who show similar behavioral characteristics. Our research goal is to perform user clustering, in real time, based on session similarity.
Anderson, James D. "Interactive Visualization of Search Results of Large Document Sets." Wright State University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=wright1547048073451373.
Attiaoui, Dorra. "Belief detection and temporal analysis of experts in question answering communities : case strudy on stack overflow." Thesis, Rennes 1, 2017. http://www.theses.fr/2017REN1S085/document.
During the last decade, people have changed the way they seek information online. With question answering communities, specialized websites, and social networks, the Web has become one of the most widespread platforms for information exchange and retrieval. Question answering communities provide an easy and quick way to search for information on any topic. The user only has to ask a question and wait for the other members of the community to respond. Anyone posting a question hopes to receive accurate and helpful answers. Within these platforms, we want to find experts. They are key users who share their knowledge with the other members of the community. Expert detection in question answering communities has become important for several reasons, such as providing high-quality content and obtaining valuable answers. In this thesis, we propose a general measure of expertise based on the theory of belief functions. Also called the mathematical theory of evidence, it is one of the best-known approaches for reasoning under uncertainty. In order to identify experts among the other users in the community, we have focused on finding the most important features that describe each individual. Next, we have developed a model founded on the theory of belief functions to estimate the general expertise of the contributors. This measure allows us to classify users and detect the most knowledgeable persons. Once this metric is defined, we look at the temporal evolution of users’ behavior. We propose an analysis of users’ activity over several months in the community. For this temporal investigation, we describe how users evolve during their time on the platform. We are also interested in detecting potential experts at the beginning of their activity. The effectiveness of these approaches is evaluated on real data from Stack Overflow.
Books on the topic "Web document clustering (WDC)"
Raghavan, Prabhakar, and Hinrich Schütze, eds. Introduction to Information Retrieval. New York: Cambridge University Press, 2008.
Manning, Christopher D., Hinrich Schütze, and Prabhakar Raghavan. Introduction to Information Retrieval. Cambridge University Press, 2008.
Introduction to Information Retrieval. Cambridge University Press, 2008.
Manning, Christopher D. Introduction to Information Retrieval. Cambridge University Press, 2008.
Manning, Christopher D., Hinrich Schütze, and Prabhakar Raghavan. Introduction to Information Retrieval. Cambridge University Press, 2012.
Book chapters on the topic "Web document clustering (WDC)"
Schenker, Adam, Mark Last, Horst Bunke, and Abraham Kandel. "Graph Representations for Web Document Clustering." In Pattern Recognition and Image Analysis, 935–42. Berlin, Heidelberg: Springer Berlin Heidelberg, 2003. http://dx.doi.org/10.1007/978-3-540-44871-6_108.
Qian, Tieyun, Jianfeng Si, Qing Li, and Qian Yu. "Leveraging Network Structure for Incremental Document Clustering." In Web Technologies and Applications, 342–53. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-29253-8_29.
Huang, Shen, Gui-Rong Xue, Ben-Yu Zhang, Zheng Chen, Yong Yu, and Wei-Ying Ma. "Multi-type Features Based Web Document Clustering." In Web Information Systems – WISE 2004, 253–65. Berlin, Heidelberg: Springer Berlin Heidelberg, 2004. http://dx.doi.org/10.1007/978-3-540-30480-7_27.
Wong, Wai-chiu, and Ada Wai-chee Fu. "Incremental Document Clustering for Web Page Classification." In Enabling Society with Information Technology, 101–10. Tokyo: Springer Japan, 2002. http://dx.doi.org/10.1007/978-4-431-66979-1_10.
Oikonomakou, N., and M. Vazirgiannis. "A Review of Web Document Clustering Approaches." In Text Mining and its Applications, 65–79. Berlin, Heidelberg: Springer Berlin Heidelberg, 2004. http://dx.doi.org/10.1007/978-3-540-45219-5_6.
Oikonomakou, Nora, and Michalis Vazirgiannis. "A Review of Web Document Clustering Approaches." In Data Mining and Knowledge Discovery Handbook, 931–48. Boston, MA: Springer US, 2009. http://dx.doi.org/10.1007/978-0-387-09823-4_48.
Wei, Yang, Jinmao Wei, and Zhenglu Yang. "Extended Strategies for Document Clustering with Word Co-occurrences." In Web Technologies and Applications, 461–72. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-25255-1_38.
Singh, Amit Prakash, Shalini Srivastava, and Sanjib Kumar Sahu. "Phrase Based Web Document Clustering: An Indexing Approach." In Lecture Notes in Networks and Systems, 481–92. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-3226-4_49.
Li, Peng, Bin Wang, Wei Jin, and Yachao Cui. "User-Related Tag Expansion for Web Document Clustering." In Lecture Notes in Computer Science, 19–31. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-20161-5_5.
Zaw, Moe Moe, and Ei Ei Mon. "Web Document Clustering by Using PSO-Based Cuckoo Search Clustering Algorithm." In Studies in Computational Intelligence, 263–81. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-13826-8_14.
Conference papers on the topic "Web document clustering (WDC)"
Han, Juhyun, Taehwan Kim, and Joongmin Choi. "Web Document Clustering by Using Automatic Keyphrase Extraction." In 2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshops. IEEE, 2007. http://dx.doi.org/10.1109/wi-iatw.2007.46.
Han, Juhyun, Taehwan Kim, and Joongmin Choi. "Web Document Clustering by Using Automatic Keyphrase Extraction." In 2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshops. IEEE, 2007. http://dx.doi.org/10.1109/wiiatw.2007.4427539.
Yang, Yu-Jiu, and Bao-Gang Hu. "Pairwise Constraints-Guided Non-negative Matrix Factorization for Document Clustering." In IEEE/WIC/ACM International Conference on Web Intelligence (WI'07). IEEE, 2007. http://dx.doi.org/10.1109/wi.2007.66.
Zhou, X. F., J. G. Liang, Y. Hu, and L. Guo. "Text Document Latent Subspace Clustering by PLSA Factors." In 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT). IEEE, 2014. http://dx.doi.org/10.1109/wi-iat.2014.131.
Aliguliyev, Ramiz. "A Novel Partitioning-Based Clustering Method and Generic Document Summarization." In 2006 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology Workshops. IEEE, 2006. http://dx.doi.org/10.1109/wi-iatw.2006.16.
Zhao, Weizhong, Qing He, Huifang Ma, and Zhongzhi Shi. "Active Learning of Instance-Level Constraints for Semi-supervised Document Clustering." In 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology. IEEE, 2009. http://dx.doi.org/10.1109/wi-iat.2009.45.
Zamir, Oren, and Oren Etzioni. "Web document clustering." In the 21st annual international ACM SIGIR conference. New York, New York, USA: ACM Press, 1998. http://dx.doi.org/10.1145/290941.290956.
Momin, B. F., P. J. Kulkarni, and Amol Chaudhari. "Web Document Clustering Using Document Index Graph." In 2006 International Conference on Advanced Computing and Communications. IEEE, 2006. http://dx.doi.org/10.1109/adcom.2006.4289851.
Tekir, Selma, Florian Mansmann, and Daniel Keim. "Geodesic distances for web document clustering." In 2011 IEEE Symposium on Computational Intelligence and Data Mining - Part of 17273 - 2011 SSCI. IEEE, 2011. http://dx.doi.org/10.1109/cidm.2011.5949449.
Liu, Debao, Dan Yang, Tiezheng Nie, Yue Kou, and Derong Shen. "Document Clustering in Personal Dataspace." In 2010 7th Web Information Systems and Applications Conference (WISA). IEEE, 2010. http://dx.doi.org/10.1109/wisa.2010.16.
Organizational reports on the topic "Web document clustering (WDC)"
He, Xiaofeng, Hongyuan Zha, Chris H. Q. Ding, and Horst D. Simon. Web document clustering using hyperlink structures. Office of Scientific and Technical Information (OSTI), May 2001. http://dx.doi.org/10.2172/815474.