Dissertations on the topic "Methods of text mining"
Browse the top 50 dissertations for research on the topic "Methods of text mining".
Johnson, Eamon B. "Methods in Text Mining for Diagnostic Radiology." Case Western Reserve University School of Graduate Studies / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=case1459514073.
Eales, James Matthew. "Text-mining of experimental methods in phylogenetics." Thesis, University of Manchester, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.529251.
Ashton, Triss A. "Accuracy and Interpretability Testing of Text Mining Methods." Thesis, University of North Texas, 2013. https://digital.library.unt.edu/ark:/67531/metadc283791/.
Zakaria, Suliman Zubi. "Retrieving Electronic Data Interchange (EDI) Dataset using Text Mining Methods." Thesis, Сумський державний університет, 2012. http://essuir.sumdu.edu.ua/handle/123456789/28658.
Bhattacharya, Sanmitra. "Computational methods for mining health communications in web 2.0." Diss., University of Iowa, 2014. https://ir.uiowa.edu/etd/4576.
Zhang, Xiaodan Hu Xiaohua. "Exploiting external/domain knowledge to enhance traditional text mining using graph-based methods /." Philadelphia, Pa. : Drexel University, 2009. http://hdl.handle.net/1860/3076.
Davis, Aaron Samuel. "Bisecting Document Clustering Using Model-Based Methods." BYU ScholarsArchive, 2009. https://scholarsarchive.byu.edu/etd/1938.
Boynukalin, Zeynep. "Emotion Analysis Of Turkish Texts By Using Machine Learning Methods." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12614521/index.pdf.
…'s research fields. The aim is to develop a machine that can detect the type of a user's emotion from his or her text. Emotion classification of English texts has been studied by several researchers, with promising results. In this thesis, an emotion classification study on Turkish texts is introduced. To the best of our knowledge, this is the first study on emotion analysis of Turkish texts. For English there are some well-defined datasets for emotion classification, but we could not find datasets in Turkish suitable for this study. Therefore, another important contribution is the generation of a new dataset in Turkish for emotion analysis. The dataset is generated by combining two types of sources. Several classification algorithms are applied to the dataset and the results are compared. Due to the nature of the Turkish language, new features are added to the existing methods to improve the success of the proposed method.
Palma, Michael, and Shidi Zhou. "A Web Scraper For Forums : Navigation and text extraction methods." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-219903.
Web forums are a popular way to exchange information and discuss various topics. These websites usually have a particular structure, divided into a start page, threads and posts. Although the structure can be consistent across forums, the layout of each forum is different. The way a web forum presents user posts is also very different from how a news website presents a single piece of information. All of this makes navigation and text extraction a difficult task for web scrapers. The focus of this thesis is the development of a web scraper specialized in forums. Three different methods for text extraction are implemented and tested before the most suitable method for the task is chosen. The methods are Word Count, Text Detection Framework and Text-to-Tag Ratio. The handling of duplicate links is carefully considered and solved by implementing a multi-layer Bloom filter. The thesis is carried out using a qualitative methodology. The results indicate that Text-to-Tag Ratio has the best overall performance and gives the most desirable result on web forums. Thus, this was the method kept in the final version of the web scraper.
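To give a feel for the Text-to-Tag Ratio method named in the abstract, here is a minimal Python sketch. It is not the thesis's implementation: the per-line definition used (non-tag characters divided by the number of tags, or the plain text length when a line has no tags), the regular expression, the `threshold` value and the sample HTML are all assumptions made for illustration.

```python
import re

# Matches anything that looks like an HTML tag; a simplification, not a full parser.
TAG_RE = re.compile(r"<[^>]*>")


def text_to_tag_ratio(line: str) -> float:
    """Return the number of non-tag characters divided by the tag count of one line."""
    tags = TAG_RE.findall(line)
    text_len = len(TAG_RE.sub("", line).strip())
    # Assumption: a line without tags scores its plain text length.
    return text_len / len(tags) if tags else float(text_len)


def extract_content(html: str, threshold: float = 10.0) -> list[str]:
    """Keep the text of lines whose ratio reaches the (hypothetical) threshold."""
    kept = []
    for line in html.splitlines():
        if text_to_tag_ratio(line) >= threshold:
            kept.append(TAG_RE.sub("", line).strip())
    return kept


sample = "\n".join([
    '<div class="nav"><a href="/">Home</a><a href="/f">Forum</a></div>',
    "<p>This post explains how the scraper walks from threads to posts.</p>",
])
print(extract_content(sample))
```

In this toy page the navigation line has many tags and little text, so it falls below the threshold, while the post line is kept. A fixed threshold is the simplest possible choice; a real scraper would probably need to tune or smooth it per site.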
Nhlabano, Valentine Velaphi. "Fast Data Analysis Methods For Social Media Data." Diss., University of Pretoria, 2018. http://hdl.handle.net/2263/72546.
Dissertation (MSc)--University of Pretoria, 2019.
National Research Foundation (NRF) - Scarce skills
Computer Science
MSc
Unrestricted
Klock, Robert. "Quality of SQL Code Security on StackOverflow and Methods of Prevention." Oberlin College Honors Theses / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=oberlin1625831198110328.
Balahur, Dobrescu Alexandra. "Methods and resources for sentiment analysis in multilingual documents of different text types." Doctoral thesis, Universidad de Alicante, 2011. http://hdl.handle.net/10045/19437.
Hirao, Eiji, Takeshi Furuhashi, Tomohiro Yoshikawa, and Daisuke Kobayashi. "A Study of Visualization Method with HK Graph Using Concept Words." 日本知能情報ファジィ学会, 2010. http://hdl.handle.net/2237/20687.
SCIS & ISIS 2010, Joint 5th International Conference on Soft Computing and Intelligent Systems and 11th International Symposium on Advanced Intelligent Systems. December 8-12, 2010, Okayama Convention Center, Okayama, Japan.
Pieper, Michael J. [Verfasser], and Svetlozar T. [Akademischer Betreuer] Račev. "Advanced Text Mining Methods for the Financial Markets and Forecasting of Intraday Volatility / Michael J. Pieper. Betreuer: S. T. Rachev." Karlsruhe : KIT-Bibliothek, 2011. http://d-nb.info/1018232648/34.
Issa, Ahmad. "A method for ontology and knowledge-base assisted text mining for diabetes discussion forum." Thesis, University of Warwick, 2015. http://wrap.warwick.ac.uk/71006/.
Joshi, Apoorva. "Trajectory-based methods to predict user churn in online health communities." Thesis, University of Iowa, 2018. https://ir.uiowa.edu/etd/6152.
Goluchowicz, Kerstin Martina [Verfasser], and Knut [Akademischer Betreuer] Blind. "Standardisation Foresight - an indicator-based, text mining and Delphi method / Kerstin Martina Goluchowicz. Betreuer: Knut Blind." Berlin : Universitätsbibliothek der Technischen Universität Berlin, 2012. http://d-nb.info/1025931017/34.
Zeeh, Julia, Karl Ledermüller, and Michaela Kobler-Weiß. "Evaluierung von Motivationsschreiben als Instrument in universitären Aufnahmeverfahren." zfhe, 2018. http://dx.doi.org/10.3217/zfhe-13-04/13.
Siffer, Alban. "New statistical methods for data mining, contributions to anomaly detection and unimodality testing." Thesis, Rennes 1, 2019. http://www.theses.fr/2019REN1S113.
This thesis proposes new statistical algorithms in two different data mining areas: anomaly detection and unimodality testing. First, a new unsupervised method for detecting outliers in streaming data is developed. It is based on the computation of probabilistic thresholds, which are themselves used to discriminate against abnormal observations. The strength of this method is its ability to run automatically without prior knowledge or hypothesis about the input data. Similarly, the generic aspect of the algorithm makes it able to operate in various fields. In particular, we develop a cyber-security use case. This thesis also proposes a new unimodality test which determines whether a data distribution has one or several modes. This test is new in two respects: its ability to handle multivariate distributions but also its low complexity, allowing it to be applied on streaming data. This more fundamental component has applications mainly in other areas of data mining such as clustering. A new algorithm incrementally searching for the k-means parameter setting is notably detailed at the end of this manuscript.
Chaudhary, Amit. "Supplementing consumer insights at Electrolux by mining social media: An exploratory case study." Thesis, Högskolan i Jönköping, Internationella Handelshögskolan, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-16096.
Повний текст джерелаPagliarani, Andrea. "New markov chain based methods for single and cross-domain sentiment classification." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2015. http://amslaurea.unibo.it/8445/.
Повний текст джерелаDuck, Geraint. "Extraction of database and software usage patterns from the bioinformatics literature." Thesis, University of Manchester, 2015. https://www.research.manchester.ac.uk/portal/en/theses/extraction-of-database-and-software-usage-patterns-from-the-bioinformatics-literature(fac16cb8-5b5b-4732-b7af-77a41cc64487).html.
Повний текст джерелаBobrik, Annette Verfasser], and Hermann [Akademischer Betreuer] [Krallmann. "Content-based Clustering in Social Corpora - A New Method for Knowledge Identification based on Text Mining and Cluster Analysis / Annette Bobrik. Betreuer: Hermann Krallmann." Berlin : Universitätsbibliothek der Technischen Universität Berlin, 2013. http://d-nb.info/1031075364/34.
Повний текст джерелаBobrik, Annette [Verfasser], and Hermann [Akademischer Betreuer] Krallmann. "Content-based Clustering in Social Corpora - A New Method for Knowledge Identification based on Text Mining and Cluster Analysis / Annette Bobrik. Betreuer: Hermann Krallmann." Berlin : Universitätsbibliothek der Technischen Universität Berlin, 2013. http://nbn-resolving.de/urn:nbn:de:kobv:83-opus-38461.
Повний текст джерелаSalehian, Ali. "PREDICTING THE DYNAMIC BEHAVIOR OF COAL MINE TAILINGS USING STATE-OF-PRACTICE GEOTECHNICAL FIELD METHODS." UKnowledge, 2013. http://uknowledge.uky.edu/ce_etds/9.
Повний текст джерелаMueller, Marianne Larissa [Verfasser], Stefan [Akademischer Betreuer] Kramer, and Frank [Akademischer Betreuer] Puppe. "Data Mining Methods for Medical Diagnosis : Test Selection, Subgroup Discovery, and Contrained Clustering / Marianne Larissa Mueller. Gutachter: Stefan Kramer ; Frank Puppe. Betreuer: Stefan Kramer." München : Universitätsbibliothek der TU München, 2012. http://d-nb.info/1024964264/34.
Повний текст джерелаSOARES, FABIO DE AZEVEDO. "AUTOMATIC TEXT CATEGORIZATION BASED ON TEXT MINING." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2013. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=23213@1.
Повний текст джерелаCONSELHO NACIONAL DE DESENVOLVIMENTO CIENTÍFICO E TECNOLÓGICO
Text Categorization, one of the tasks performed in Text Mining, can be described as obtaining a function that assigns a document to the previously defined category to which it belongs. The main goal of building a taxonomy of documents is to make relevant information easier to obtain. However, implementing and running a Text Categorization process is not a trivial task: Text Mining tools are still maturing and demand considerable technical expertise from their users. In addition, the language in which the documents are written plays a major role in a Text Mining process and must be handled according to its particularities, yet there is a great lack of tools that handle Brazilian Portuguese properly. The main aims of this work are therefore to research, propose, implement and evaluate a Text Mining framework for Automatic Text Categorization that assists the execution of the knowledge discovery process and offers linguistic processing for Brazilian Portuguese.
Baker, Simon. "Semantic text classification for cancer text mining." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/275838.
Lu, Zhiyong. "Text mining on GeneRIFs /." Connect to full text via ProQuest. Limited to UCD Anschutz Medical Campus, 2007.
Typescript. Includes bibliographical references (leaves 174-182). Free to UCD affiliates. Online version available via ProQuest Digital Dissertations.
Gonçalves, Lea Silvia Martins. "Categorização em Text Mining." Universidade de São Paulo, 2002. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-22062015-202748/.
The technological and scientific progress of recent decades has driven the development of increasingly efficient methods for storing and processing data. Knowledge can be obtained by analyzing and interpreting data, and it has become fundamentally important to many organizations because it supports decision making. Most of the data available today is in textual form, as the rapid growth of the Internet illustrates. Because texts are unstructured data, a series of steps is needed to transform them into structured data that can be analyzed. The process known as Text Mining is an emerging technology that aims to analyze large collections of documents. This master's dissertation covers the use of different Text Mining techniques and tools which, together with the text pre-processing module designed and implemented by Imamura (2001), can be used for texts in Portuguese. Several algorithms used for extracting knowledge from data are explored: Nearest Neighbor, Naive Bayes, Decision Tree, Decision Rule, Decision Table and Support Vector Machines. Experiments were carried out to verify how these algorithms behave on texts in Portuguese.
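As an illustration of one of the algorithms the abstract lists, the following is a small Naive Bayes text classifier in plain Python. It is a sketch under stated assumptions, not the dissertation's code: whitespace tokenization, add-one (Laplace) smoothing, the function names `train` and `predict`, and the four toy training sentences are all hypothetical choices made for this example.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """docs: list of (text, label) pairs. Returns the counts needed by predict()."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability for the given text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing (an assumption of this sketch).
            score += math.log((word_counts[label][token] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data.
model = train([
    ("the team won the match", "sports"),
    ("the striker scored a goal", "sports"),
    ("the bank raised interest rates", "finance"),
    ("stocks and bonds fell", "finance"),
])
print(predict(model, "goal in the match"))
print(predict(model, "interest rates fell"))
```

For Portuguese texts the tokenization step would be replaced by language-specific pre-processing such as the module the abstract refers to; the counting and scoring logic would stay the same.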
Al-Halimi, Reem Khalil. "Mining Topic Signals from Text." Thesis, University of Waterloo, 2003. http://hdl.handle.net/10012/1165.
Zaghloul, Waleed A. Lee Sang M. "Text mining using neural networks." Lincoln, Neb. : University of Nebraska-Lincoln, 2005. http://0-www.unl.edu.library.unl.edu/libr/Dissertations/2005/Zaghloul.pdf.
Title from title screen (sites viewed on Oct. 18, 2005). PDF text: 100 p. : col. ill. Includes bibliographical references (p. 95-100 of dissertation).
Rice, Simon B. "Text data mining in bioinformatics." Thesis, University of Manchester, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.488351.
Munyana, Nicole. "Le text mining et XML." Thèse, Trois-Rivières : Université du Québec à Trois-Rivières, 2007. http://www.uqtr.ca/biblio/notice/resume/30024815R.pdf.
Theußl, Stefan, Ingo Feinerer, and Kurt Hornik. "Distributed Text Mining in R." WU Vienna University of Economics and Business, 2011. http://epub.wu.ac.at/3034/1/Theussl_etal%2D2011%2Dpreprint.pdf.
Series: Research Report Series / Department of Statistics and Mathematics
Meyer, David, Kurt Hornik, and Ingo Feinerer. "Text Mining Infrastructure in R." American Statistical Association, 2008. http://epub.wu.ac.at/3978/1/textmining.pdf.
Martins, Bruno. "Geographically Aware Web Text Mining." Master's thesis, Department of Informatics, University of Lisbon, 2009. http://hdl.handle.net/10451/14301.
McDonald, Daniel Merrill. "Combining Text Structure and Meaning to Support Text Mining." Diss., The University of Arizona, 2006. http://hdl.handle.net/10150/194015.
Höckert, Linda. "Kemisk stabilisering av gruvavfall från Ljusnarsbergsfältet med mesakalk och avloppsslam." Thesis, Uppsala University, Department of Earth Sciences, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-88825.
Mine waste from Ljusnarsbergsfältet in Kopparberg, Sweden, is considered to pose a great risk to human health and the surrounding environment. Part of the waste rock, the barren rock left over from ore mining, consists of sulphide minerals. When sulphide minerals come into contact with dissolved oxygen and precipitation, they may oxidise, resulting in acid mine drainage (AMD) and the release of heavy metals. The purpose of this study was to characterise the waste material and to try to chemically stabilise the waste rock with a mixture of sewage sludge and calcium carbonate (mesakalk). The drawback of using organic matter is the risk that dissolved organic matter acts as a complexing agent for heavy metals and thereby increases their mobility, so an additional study examining this risk was also performed.
The project started with a pilot study to identify the material fraction suitable for the experiment. Once suitable material had been chosen, a column test was carried out to study the influence of the sludge/lime mixture on metal mobility and on the production of acidity. A pH-stat batch test was used to assess the organic material's potential for complexation at different pH values. Drainage water samples were taken from the columns regularly during the experiment and analysed for pH, electrical conductivity, alkalinity, redox potential, dissolved organic carbon (DOC), sulphate and leached metals. The effluent from the pH-stat test was sampled on only a few occasions and analysed only for metal content and change in DOC concentration.
The results from the laboratory experiments showed that the waste rock from Ljusnarsberg readily leached large amounts of metals. The stabilisation maintained a near-neutral pH in the leachate, compared with a pH of about 3 in the leachate from untreated waste rock. The average concentrations of copper and zinc in the leachate from untreated waste rock exceeded 100 and 1000 mg/l respectively, while these metals were detected at around 0.1 and 1 mg/l respectively in the leachate from the treated waste. Concentrations of the examined metals were 40 to 4000 times lower in the leachate from treated waste rock, which implies that the stabilisation with reactive amendments succeeded. The long-term effects, however, have not been determined. The added sludge did not cause any major increase in DOC concentration in the pH range achieved with the mesakalk, and the pH-stat test showed that it contributed to immobilising metals at neutral pH despite a small increase in DOC concentration. If pH decreases over time, however, there is a risk of increased metal leaching.
Olsson, Elin. "Deriving Genetic Networks Using Text Mining." Thesis, University of Skövde, Department of Computer Science, 2002. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-708.
The Internet holds an enormous amount of information in unstructured form. The purpose of a text mining tool is to collect this information and present it in a more structured form. In this report, text mining is used to create an algorithm that searches abstracts available from PubMed and finds specific relationships between genes that can be used to create a network. The algorithm can also be used to find information about a specific gene. Using the algorithm, all connections but one in the network created by Mendoza et al. (1999) were verified; the remaining connection contained implicit information. The results suggest that the algorithm is better at extracting information about specific genes than at finding connections between genes. One advantage of the algorithm is that it can also find connections between genes and proteins and between genes and other chemical substances.
Fivelstad, Ole Kristian. "Temporal Text Mining : The TTM Testbench." Thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2007. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-8764.
This master's thesis presents the Temporal Text Mining (TTM) Testbench, an application for discovering association rules in temporal document collections. It continues work done in two projects, in the fall of 2005 and the fall of 2006, which laid the foundation for this thesis. The focus of the work is on identifying and extracting meaningful terms from textual documents to improve the meaningfulness of the mined association rules. Much work has gone into compiling the theoretical foundation of this project, which has been used to assess different approaches for finding meaningful and descriptive terms. The old TTM Testbench has been extended to include the use of WordNet and operations for finding collocations, performing word sense disambiguation, and extracting higher-level concepts and categories from the individual documents. A method for rating association rules based on the semantic similarity of the terms present in the rules has also been implemented, in an attempt to narrow down the result set and filter out rules that are unlikely to be interesting. Experiments performed with the improved application show that the use of WordNet and the new operations can help increase the meaningfulness of the rules. One factor that plays a big part in this is that synonyms of words are added to make the terms more understandable. However, the experiments also showed that it was difficult to decide whether a rule was interesting, which made it impossible to draw any conclusions about the suitability of semantic similarity for finding interesting rules. All work on the TTM Testbench so far has focused on finding association rules in web newspapers. It may, however, be useful to perform experiments in a more limited domain, for example medicine, where the interestingness of a rule may be more easily decided.
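For readers unfamiliar with association rules over documents, here is a minimal Python sketch of the two standard measures, support and confidence, applied to sets of terms. It does not reproduce the TTM Testbench: the function name `mine_rules`, the thresholds, the restriction to single-term rules and the four toy documents are assumptions, and the WordNet-based semantic similarity rating described in the abstract is not included.

```python
from itertools import combinations


def mine_rules(documents, min_support=0.5, min_confidence=0.7):
    """documents: list of term sets.

    Returns (antecedent, consequent, support, confidence) tuples
    for rules of the form "term x => term y".
    """
    n = len(documents)

    def support(terms):
        # Fraction of documents that contain every term in `terms`.
        return sum(1 for doc in documents if terms <= doc) / n

    all_terms = sorted(set().union(*documents))
    rules = []
    for a, b in combinations(all_terms, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = pair_support / support({x})
            if confidence >= min_confidence:
                rules.append((x, y, pair_support, confidence))
    return rules


# Hypothetical toy collection: each document reduced to its set of terms.
docs = [
    {"election", "vote", "party"},
    {"election", "vote"},
    {"election", "party"},
    {"vote", "weather"},
]
for rule in mine_rules(docs):
    print(rule)
```

With these thresholds the only surviving rule is "party => election": both terms occur together in half of the documents, and every document containing "party" also contains "election".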
Jelier, Rob. "Text mining applied to molecular biology." [S.l.] : Rotterdam : [The Author] ; Erasmus University [Host], 2008. http://hdl.handle.net/1765/10866.
Rentzmann, René. "Text mining im Customer-relationship-Management." Hamburg Kovač, 2007. http://d-nb.info/987473808/04.
Rentzmann, René. "Text Mining im Customer Relationship Management /." Hamburg : Kovač, 2008. http://www.verlagdrkovac.de/978-3-8300-3510-7.htm.
Leroy, Gondy, Hsinchun Chen, Jesse D. Martinez, Shauna Eggers, Ryan R. Falsey, Kerri L. Kislin, Zan Huang, et al. "Genescene: Biomedical Text And Data Mining." Wiley Periodicals, Inc, 2005. http://hdl.handle.net/10150/105791.
To access the content of digital texts efficiently, it is necessary to provide more sophisticated access than keyword based searching. Genescene provides biomedical researchers with research findings and background relations automatically extracted from text and experimental data. These provide a more detailed overview of the information available. The extracted relations were evaluated by qualified researchers and are precise. A qualitative ongoing evaluation of the current online interface indicates that this method to search the literature is more useful and efficient than keyword based searching.
Gilli, Giacomo. "Text Mining mediante l'utilizzo di Orange." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2013. http://amslaurea.unibo.it/5041/.
GOMES, ROBERTO MIRANDA. "WORD SENSE DESAMBIGUATION IN TEXT MINING." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2009. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=14103@1.
This dissertation investigated the application of text mining processes, based on computational intelligence and machine learning techniques, to the problem of word sense ambiguity. The work, in the area of decision support methods, aimed to develop techniques capable of automating the disambiguation process and to build a prototype implementing some of these techniques. Word sense disambiguation is the process of assigning a meaning to a word using information gathered from the context in which it occurs; one of its goals is to mitigate the errors introduced by ambiguous textual constructions, thereby supporting the decision-making process. The work also sought to reuse concepts, tools and forms of documentation from earlier studies, so as to continue the scientific development and leave a legacy that is easier to reuse in future work. Special attention was given to the process of ambiguity detection and, for this reason, a differentiated approach was used. Unlike the most common form of disambiguation, in which a machine is trained to disambiguate a given term, the present work sought not to depend on knowing the term to be treated, and thus to make the system more robust and generic. To achieve this, specific heuristics were developed based on computational intelligence techniques. The semantic criteria used to identify ambiguous terms were extracted from clustering techniques applied to lexicons built after some term normalization process. The prototype, SID (Sistema Inteligente de Desambiguação, Intelligent Disambiguation System), was developed in .NET, which allows a wide variety of languages to be used in development and so makes it easier to reuse the code to continue the research or to apply the implemented techniques in a text mining application. The language chosen was C#, for its robustness, ease of use and syntactic similarity to Java and C++, languages widely known and used by most developers.
AGUIAR, C. Z. "Concept Maps Mining for Text Summarization." Universidade Federal do Espírito Santo, 2017. http://repositorio.ufes.br/handle/10/9846.
Повний текст джерела8 Resumo Os mapas conceituais são ferramentas gráficas para a representação e construção do conhecimento. Conceitos e relações formam a base para o aprendizado e, portanto, os mapas conceituais têm sido amplamente utilizados em diferentes situações e para diferentes propósitos na educação, sendo uma delas a represent ação do texto escrito. Mes mo um gramá tico e complexo texto pode ser representado por um mapa conceitual contendo apenas conceitos e relações que represente m o que foi expresso de uma forma mais complicada. No entanto, a construção manual de um mapa conceit ual exige bastante tempo e esforço na identificação e estruturação do conhecimento, especialmente quando o mapa não deve representar os conceitos da estrutura cognitiva do autor. Em vez disso, o mapa deve representar os conceitos expressos em um texto. Ass im, várias abordagens tecnológicas foram propostas para facilitar o processo de construção de mapas conceituais a partir de textos. Portanto, esta dissertação propõe uma nova abordagem para a construção automática de mapas conceituais como sumarização de t extos científicos. A sumarização pretende produzir um mapa conceitual como uma representação resumida do texto, mantendo suas diversas e mais importantes características. A sumarização pode facilitar a compreensão dos textos, uma vez que os alunos estão te ntando lidar com a sobrecarga cognitiva causada pela crescente quantidade de informação textual disponível atualmente. Este crescimento também pode ser prejudicial à construção do conhecimento. Assim, consideramos a hipótese de que a sumarização de um text o representado por um mapa conceitual pode atribuir características importantes para assimilar o conhecimento do texto, bem como diminuir a sua complexidade e o tempo necessário para processá - lo. 
In this context, we conducted a review of the literature from 1994 to 2016 on approaches that aim to construct concept maps automatically from texts. From this review, we built a categorization to better identify and analyze the resources and characteristics of these technological approaches. We also sought to identify the limitations and gather the best characteristics of related work in order to propose our approach. In addition, we present a Concept Map Mining process organized along four dimensions: Data Source Description, Domain Definition, Element Identification, and Map Visualization. With the aim of developing a computational architecture to automatically build concept maps as summarizations of academic texts, this research resulted in the public tool CMBuilder, an online tool for the automatic construction of concept maps from texts, as well as a Java API called ExtroutNLP, which contains libraries for information extraction and public services. To reach the proposed objective, we directed efforts to the areas of natural language processing and information retrieval. We emphasize that the main task in reaching our objective is to extract propositions of the form (concept, relation, concept) from the text. Under this premise, the research introduces a pipeline comprising: grammar rules and depth-first search for extracting concepts and relations from the text; preposition mapping, anaphora resolution, and exploitation of named entities for labeling concepts; concept ranking based on element frequency analysis and map topology; and proposition summarization based on graph topology.
Furthermore, the approach proposes the use of supervised learning techniques for clustering and classification, combined with a thesaurus, to define the domain of the text and build a conceptual vocabulary of domains. Finally, an objective analysis to validate the accuracy of the ExtroutNLP library was performed and shows 0.65 precision on the corpus. A subjective analysis to validate the quality of the concept map built by the CMBuilder tool was also carried out, yielding precision/recall of 0.75/0.45 for concepts and 0.57/0.23 for relations in English, and 0.68/0.38 for concepts and 0.41/0.19 for relations in Portuguese. In addition, an experiment was conducted to verify whether the concept map summarized by CMBuilder influences comprehension of the subject addressed in a text, reaching 60% correct answers for maps extracted from short texts with multiple-choice questions and 77% correct answers for maps extracted from long texts with open-ended questions.
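The ranking and summarization steps named in this abstract (concept ranking from frequency and map topology, then proposition summarization from graph topology) can be illustrated with a minimal sketch. This is not the CMBuilder or ExtroutNLP implementation: the propositions, the term frequencies, the `alpha` weighting, and the function names `rank_concepts` and `summarize` are all assumptions made for illustration, and node degree stands in for "topology".

```python
from collections import Counter

# Hypothetical (concept, relation, concept) propositions; not thesis data.
propositions = [
    ("concept map", "represents", "knowledge"),
    ("concept map", "summarizes", "text"),
    ("student", "reads", "text"),
    ("concept map", "contains", "concept"),
    ("concept", "links to", "relation"),
]

# Hypothetical counts of how often each concept occurs in the source text.
term_freq = {"concept map": 6, "text": 5, "concept": 3,
             "knowledge": 2, "student": 1, "relation": 1}


def rank_concepts(props, freq, alpha=0.5):
    """Rank concepts by a mix of text frequency and graph degree.

    The score alpha * freq/max_freq + (1 - alpha) * degree/max_degree is an
    assumed formula, chosen only to combine the two signals.
    """
    degree = Counter()
    for source, _relation, target in props:
        degree[source] += 1
        degree[target] += 1
    max_degree = max(degree.values())
    max_freq = max(freq.get(c, 0) for c in degree) or 1
    scores = {
        c: alpha * freq.get(c, 0) / max_freq
           + (1 - alpha) * degree[c] / max_degree
        for c in degree
    }
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda c: (-scores[c], c))


def summarize(props, freq, keep=3):
    """Keep only propositions whose two concepts are both top-ranked."""
    top = set(rank_concepts(props, freq)[:keep])
    return [p for p in props if p[0] in top and p[2] in top]


ranking = rank_concepts(propositions, term_freq)
summary = summarize(propositions, term_freq, keep=3)
```

With these toy inputs, `summary` retains only the propositions that connect the three highest-ranked concepts, which is one simple way to read "summarization based on graph topology".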
Hellström, Karlsson Rebecca. "Aiding Remote Diagnosis with Text Mining." Thesis, KTH, Människa och Kommunikation, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-215760.
The subject of this thesis is how text mining can be applied to patient-reported symptom descriptions and how it can help physicians carry out the diagnostic process. Healthcare currently has difficulty delivering care to remote locations, and healthcare costs are rising with an aging population. It is currently unknown how text mining could help doctors in their work. Investigating whether physicians are helped by being presented with more information, based on what patients who write similar things to their current patient write, may be relevant to several areas of healthcare. Text mining has the potential to improve the quality of care for patients with low access to care, for example because of distance. In this work, patient texts were represented with a Bag-of-Words model and clustered with a k-means algorithm. The final clustering model used 41 clusters, and the ten most important words of each cluster centroid were used to represent the respective cluster. An experiment was then carried out to see whether and how physicians were aided in their diagnostic process when patients' texts were presented together with the ten words from the clusters to which the texts belonged. The results of the experiment were that the words helped the physicians in the more complicated patient cases, and that the clustering algorithm could be used to ask patients specific follow-up questions.
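The method summarized above (Bag-of-Words vectors, k-means clustering, and the highest-weighted centroid words as cluster labels) can be sketched in plain Python. The sample texts, the two-cluster setup, the choice of initial centroids, and the helper names are assumptions for illustration only; the thesis itself reports 41 clusters and ten words per cluster, and its data and code are not reproduced here.

```python
import math
from collections import Counter

# Hypothetical symptom descriptions; not data from the thesis.
texts = [
    "cough fever sore throat",
    "fever cough headache",
    "knee pain swelling",
    "swelling knee pain after running",
]


def bag_of_words(docs):
    """Return a sorted vocabulary and one word-count vector per document."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    vectors = []
    for doc in docs:
        counts = Counter(doc.split())
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, init_centroids, iterations=20):
    """Plain k-means: assign each vector to its nearest centroid, then
    recompute each centroid as the mean of its members."""
    centroids = [list(c) for c in init_centroids]
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda j: distance(v, centroids[j]))
            for v in vectors
        ]
        for j in range(len(centroids)):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_words(centroid, vocab, n=10):
    """Words with the largest centroid weights; ties broken alphabetically."""
    ranked = sorted(zip(vocab, centroid), key=lambda pair: (-pair[1], pair[0]))
    return [word for word, weight in ranked[:n] if weight > 0]


vocab, vectors = bag_of_words(texts)
# Assumed deterministic start: the first and third texts seed the clusters.
labels, centroids = kmeans(vectors, [vectors[0], vectors[2]])
cluster_words = [top_words(c, vocab, n=3) for c in centroids]
```

Here `cluster_words` plays the role of the per-cluster word lists shown to physicians in the experiment; a real setup would more likely use an established library implementation of k-means with random restarts instead of fixed seeds.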
Stolt, Richard. "The Business Value of Text Mining." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-13740.