
Dissertations / Theses on the topic 'Test mining'



Consult the top 50 dissertations / theses for your research on the topic 'Test mining.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and the bibliographic reference to the chosen work will be generated automatically in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses from a wide variety of disciplines and organise your bibliography correctly.

1

Parmeza, Edvin. "Experimental Evaluation of Tools for Mining Test Execution Logs." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-53531.

Full text
Abstract:
Data and software analysis tools are widely regarded as a beneficial approach in software industry environments. They are powerful tools that help generate testing, web browsing and mail server statistics in different formats. These statistics are also known as logs or log files, and they can be produced in different formats, textual or visual, depending on the tool that generates them. Although these tools have been used in the software industry for many years, software developers and testers still do not fully understand them. The literature shows that related work on test execution log analysis is rather limited. Studies evaluating a subset of features related to test execution logs are missing from the existing literature, since even those that exist usually focus on a single-feature comparison (e.g., fault-localization algorithms). One of the reasons for this might be a lack of experience or training. Some practitioners are also not fully involved with the testing tools that their companies use, so lack of time and involvement might be another reason why there are only a few experts in this field who understand these tools very well and can find any error in a short time. This makes the need for more research on this topic even more important. In this thesis report, we present a case study focused on the evaluation of tools used for analyzing test execution logs. Our work relied on three different studies: a literature study, an experimental study, and an expert-based survey study. To get familiar with the topic, we started with the literature study. It helped us investigate the current tools and approaches that exist in the software industry. It was a very important but also difficult step, since it was hard to find research papers relevant to our work: our topic was very specific, while many research papers had performed only a general investigation of different tools. That is why, in order to obtain relevant papers, our literature search had to use specific digital libraries, terms and keywords, and criteria for literature selection. In the next step, we experimented with two specific tools in order to investigate the capabilities and features they provide for analyzing execution logs. The tools we worked with are Splunk and Loggly; they were the only tools available to us that conformed to the demands of the thesis, containing the features we needed for our work to be more complete. The last part of the study was a survey, which we sent to different experts. A total of twenty-six practitioners responded, and their answers gave us a lot of useful information to enrich our work. The contributions of this thesis are: 1. An analysis of the findings and results derived from the three studies, in order to identify the performance of the tools, the fault-localization techniques they use, and the test failures that occur during test runs, and to conclude which tool is better in these terms. 2. Proposals on how to further improve work on log analysis tools, explaining what else is needed to better understand these tools and to obtain correct results during testing.
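For illustration of the kind of test execution log analysis such tools automate, here is a minimal sketch (not code from the thesis; the "timestamp test_name PASS|FAIL" line format is an assumption) that counts pass/fail outcomes per test case from a plain-text log:

    # Minimal sketch: summarise pass/fail counts per test from a plain-text
    # execution log. The line format is assumed for illustration only.
    import re
    from collections import Counter

    LINE = re.compile(r"^\S+\s+(?P<test>\S+)\s+(?P<status>PASS|FAIL)\b")

    def summarise(log_path):
        outcomes = Counter()
        with open(log_path, encoding="utf-8") as fh:
            for line in fh:
                m = LINE.match(line)
                if m:
                    outcomes[(m.group("test"), m.group("status"))] += 1
        return outcomes

    if __name__ == "__main__":
        for (test, status), n in summarise("test_run.log").most_common():
            print(f"{test}\t{status}\t{n}")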
2

Myneni, Greeshma. "A System for Managing Experiments in Data Mining." University of Akron / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=akron1279373421.

Full text
3

Shnorhokian, Shahe. "Development of a quantitative accelerated sulphate attack test for mine back fill." Thesis, McGill University, 2009. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=40712.

Full text
Abstract:
Mining operations produce large amounts of tailings that are either disposed of in surface impoundments or used in the production of backfill to be placed underground. Their mineralogy is determined by the local geology, and it is not uncommon to come across tailings with a relatively high sulphide mineral content, including pyrite and pyrrhotite. Sulphides oxidize in the presence of oxygen and water to produce sulphate and acidity. In the concrete industry, sulphate is known to produce detrimental effects by reacting with the cement paste to produce the minerals ettringite and gypsum. Because mine backfill uses tailings and binders – including cement – it is therefore prone to sulphate attack where the required conditions are met. Currently, laboratory tests on mine backfill mostly measure mechanical properties such as strength parameters, and the study of the chemical aspects is restricted to the impact of tailings on the environment. The potential of sulphate attack in mine backfill has not been studied at length, and no tests are conducted on binders used in backfill for their resistance to attack. Current ASTM guidelines for sulphate attack tests have been deemed inadequate by several authors due to their measurement of only expansion as an indicator of attack. Furthermore, the tests take too long to perform or are restricted to cement mortars only, and not to mixed binders that include pozzolans. Based on these, an accelerated test for sulphate attack was developed in this work through modifying and compiling procedures that had been suggested by different authors. Small cubes of two different binders were fully immersed in daily-monitored sodium sulphate and sulphuric acid solutions for a total of 28 days, after 7 days of accelerated curing at 50ºC. In addition, four binders were partially immersed in the same solutions for 8 days for an accelerated attack process. The two procedures were conducted in tandem with leach tests using a mixed solution of
Les opérations minières produisent de grandes quantités de rejets miniers qui sont soit stockés en surface dans des haldes, soit réutilisés comme remblais sous terre. La minéralogie de ces déchets est dictée par la géologie des lieux, et il est commun de trouver des rejets qui ont une teneur élevée en minéraux sulfurés comme la pyrite et la pyrrhotite. Les sulfures sont oxydés en présence d’eau et d’oxygène et produisent une eau acide et riche en sulfates. Dans l’industrie du béton, un des grands problèmes provient de la réaction des sulfates de sources externes avec le ciment du béton pour former de l’ettringite et du gypse. Étant donné que les remblais dans les mines se servent des rejets et d’agents de liaison comme le ciment, ils sont sensibles aux attaques des sulfates si les conditions sont propices. En ce moment dans les laboratoires, on s’intéresse surtout aux paramètres mécaniques comme la résistance en compression et l’impact chimique que les rejets miniers ont sur l’environnement. Aucune recherche concrète n’a été faite sur les dangers de l’attaque des sulfates sur les remblais dans les mines et sur les différents agents de liaisons, afin de déterminer leurs résistances à de telles attaques.Les directives actuelles de l’ASTM pour tester l’attaque des sulfates se sont avérées inadéquates. En effet, ces tests sont seulement basés sur l’expansion, ce qui ne se produit pas forcément lors de l’attaque par des sulfates. De plus, ces tests sont trop longs et ne peuvent s’appliquer qu’à certains mélanges spécifiques de ciment et pas à d’autres comme la pouzzolane. Sur ces faits, un test accéléré a été mis en place par certains chercheurs. Après un séchage accéléré dans un four à 50ºC, des échantillons sont immergés dans des solutions de sulfate de sodium et d’acide sulfurique pendant 28 jours d’une part. D’autre part, d’autres échantillons sont immergés à moi
4

Hagemann, Stephan. "Masszahlen für die Assoziationsanalyse im Data-Mining Fundierung, Analyse und Test." Hamburg Diplomica-Verl, 2005. http://d-nb.info/991844947/04.

Full text
5

Tanner, Joseph Leo. "Fabrication and characterisation of multilayer test structures for coated conductor cylinder technology." Thesis, University of Birmingham, 2010. http://etheses.bham.ac.uk//id/eprint/1282/.

Full text
Abstract:
The construction of a multi-layered, multi-turn coated conductor cylinder encompasses several aspects that may limit its performance unless they are designed and fabricated in a suitable way. This project investigates the optimum thicknesses of YBa\(_2\)Cu\(_3\)O\(_{7-\delta}\) (YBCO) superconductor and SrTiO\(_3\) (STO) insulator layers, interconnect design between YBCO layers and the fabrication process for defining tracks in the YBCO. Test samples were produced by pulsed laser deposition (PLD), photolithographic and ion-beam and chemical etching techniques and were characterised by AC susceptibility, transport measurements, atomic force microscopy (AFM), scanning electron microscopy (SEM), electron backscattered diffraction (EBSD) and x-ray diffraction (XRD). The growth conditions produce a YBCO film that develops a strong texture even over an ion-beam milled edge. Additional steps were required to remove contaminants from the surface after photolithographic processes, with both ion-beam milling and alkaline etching proving effective. Interconnects were successfully fabricated and were most effective when a large step was ion-beam milled into the first YBCO layer, yielding a critical current density (Jc) of 8.58 × 10\(^5\) A/cm\(^2\). Electrical transport through a crossover was made possible by the application of an additional etching process to create a gentler slope, although further optimisation is required to improve epitaxial growth on the track edge.
6

Bohannon, Stacy Jo. "Hydrogeology of the San Xavier Mining Laboratory and Geophysics Test Site and surrounding area." FIND on the Web, 1991.

Find full text
7

Bohannon, Stacy Jo 1965. "Hydrogeology of the San Xavier Mining Laboratory and Geophysics Test Site and surrounding area." Thesis, The University of Arizona, 1991. http://hdl.handle.net/10150/192053.

Full text
Abstract:
Water level, permeability, and water quality data indicate that the aquifer beneath the San Xavier Mining Laboratory is unconfined, highly permeable, and isolated from the adjacent, upgradient aquifer. The aquifer at San Xavier has been dewatered considerably due to past pumping at the mine and at nearby open-pit mines. Water levels have not recovered due to the low permeability of the upgradient aquifer and the restriction of ground-water flow across the thrust fault separating the upgradient and San Xavier aquifers. The Mining Laboratory is in full compliance with the Aquifer Protection Permit issued to the facility by the Arizona Department of Environmental Quality. The ground water at the mine meets all state and federal primary drinking water standards. The future use of the aquifer does not appear to be threatened by research being conducted at the mine.
8

Walters, Matthew. "Sulphide stress cracking test development for a weldable 13%cr supermartensitic stainless steel in simulated seabed environments." Thesis, University of Birmingham, 2016. http://etheses.bham.ac.uk//id/eprint/6726/.

Full text
Abstract:
Weldable 13%Cr supermartensitic stainless steels are commonly used for subsea pipelines in the oil and gas industry. Although classified as corrosion resistant alloys, these steels can be susceptible to Sulphide Stress Cracking (SSC) when exposed to wet environments containing chlorides, carbon dioxide and low levels of hydrogen sulphide. Standard guidelines stipulate that laboratory SSC tests are performed at 24 °C and at the maximum design temperature; however, some studies suggest that the risk of SSC could be greater at temperatures below 24 °C. Seabed temperatures can be as low as 5 °C, so in-service cracking could occur following shut-down conditions even if the material has been qualified at 24 °C. Four-point bend SSC tests performed at 5 °C and 24 °C in simulated seabed environments showed the material was more susceptible to SSC at 5 °C, but only when the as-received pipe surface was compromised. A supporting stress and strain investigation highlighted strain concentrations on the test surface which were coincident with the location of cracking observed in the SSC tests. Finite element simulations were used to demonstrate that tensile stress-strain data should be used over flexural bend data to load four-point bend specimens to the desired loading strain.
9

Walker, Daniel David. "Bayesian Test Analytics for Document Collections." BYU ScholarsArchive, 2012. https://scholarsarchive.byu.edu/etd/3530.

Full text
Abstract:
Modern document collections are too large to annotate and curate manually. As increasingly large amounts of data become available, historians, librarians and other scholars increasingly need to rely on automated systems to efficiently and accurately analyze the contents of their collections and to find new and interesting patterns therein. Modern techniques in Bayesian text analytics are becoming widespread and have the potential to revolutionize the way that research is conducted. Much work has been done in the document modeling community towards this end, though most of it is focused on modern, relatively clean text data. We present research for improved modeling of document collections that may contain textual noise or that may include real-valued metadata associated with the documents. This class of documents includes many historical document collections. Indeed, our specific motivation for this work is to help improve the modeling of historical documents, which are often noisy and/or have historical context represented by metadata. Many historical documents are digitized by means of Optical Character Recognition (OCR) from document images of old and degraded original documents. Historical documents also often include associated metadata, such as timestamps, which can be incorporated in an analysis of their topical content. Many techniques, such as topic models, have been developed to automatically discover patterns of meaning in large collections of text. While these methods are useful, they can break down in the presence of OCR errors. We show the extent to which this performance breakdown occurs. The specific types of analyses covered in this dissertation are document clustering, feature selection, unsupervised and supervised topic modeling for documents with and without OCR errors, and a new supervised topic model that uses Bayesian nonparametrics to improve the modeling of document metadata. We present results in each of these areas, with an emphasis on studying the effects of noise on the performance of the algorithms and on modeling the metadata associated with the documents. In this research we effectively: improve the state of the art in both document clustering and topic modeling; introduce a useful synthetic dataset for historical document researchers; and present analyses that empirically show how existing algorithms break down in the presence of OCR errors.
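As a minimal illustration of the kind of topic modeling such analyses start from (a sketch only, not the models developed in the dissertation; the toy corpus, topic count and library choice are assumptions), a plain LDA fit with scikit-learn:

    # Minimal sketch: fit an LDA topic model to a tiny corpus and print the
    # top terms per topic. Corpus and parameters are illustrative placeholders.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    docs = [
        "the court heard the land dispute in the spring session",
        "wheat and corn prices fell after the autumn harvest",
        "the harvest sermon was printed and sold in town",
    ]

    vectorizer = CountVectorizer(stop_words="english")
    X = vectorizer.fit_transform(docs)

    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    lda.fit(X)

    terms = vectorizer.get_feature_names_out()
    for k, weights in enumerate(lda.components_):
        top = [terms[i] for i in weights.argsort()[-5:][::-1]]
        print(f"topic {k}: {', '.join(top)}")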
10

Desta, Feven S., and Mike W. N. Buxton. "The use of RGB Imaging and FTIR Sensors for mineral mapping in the Reiche Zeche underground test mine, Freiberg." Technische Universitaet Bergakademie Freiberg Universitaetsbibliothek "Georgius Agricola", 2018. http://nbn-resolving.de/urn:nbn:de:bsz:105-qucosa-231302.

Full text
Abstract:
The application of sensor technologies for raw material characterization is rapidly growing, and innovative advancement of the technologies is observed. Sensors are being used as laboratory and in-situ techniques for characterization and definition of raw material properties. However, application of sensor technologies for underground mining resource extraction is very limited and highly dependent on the geological and operational environment. In this study the potential of RGB imaging and FTIR spectroscopy for the characterization of polymetallic sulphide minerals in a test case of Freiberg mine was investigated. A defined imaging procedure was used to acquire RGB images. The images were georeferenced, mosaicked and a mineral map was produced using a supervised image classification technique. Five mineral types have been identified and the overall classification accuracy shows the potential of the technique for the delineation of sulphide ores in an underground mine. FTIR data in combination with chemometric techniques were evaluated for discrimination of the test case materials. Experimental design was implemented in order to identify optimal pre-processing strategies. Using the processed data, PLS-DA classification models were developed to assess the capability of the model to discriminate the three material types. The acquired calibration and prediction statistics show the approach is efficient and provides acceptable classification success. In addition, important variables (wavelength location) responsible for the discrimination of the three materials type were identified.
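For illustration only: PLS-DA is often implemented as PLS regression on one-hot class labels, predicting the class with the largest response. The sketch below follows that common recipe with synthetic "spectra"; it is not the chemometric software or data used in the study.

    # Minimal PLS-DA sketch: PLS regression on one-hot labels, then argmax.
    # Spectra and class structure are synthetic placeholders.
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.preprocessing import label_binarize

    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 200))          # 60 "spectra", 200 wavelength channels
    y = np.repeat([0, 1, 2], 20)            # three material types
    X[y == 1, 50:60] += 1.0                 # inject class-dependent absorption bands
    X[y == 2, 120:130] += 1.0

    Y = label_binarize(y, classes=[0, 1, 2])
    pls = PLSRegression(n_components=5).fit(X, Y)
    pred = pls.predict(X).argmax(axis=1)
    print("training accuracy:", (pred == y).mean())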
11

Moore, David Gerald. "AspectAssay: A Technique for Expanding the Pool of Available Aspect Mining Test Data Using Concern Seeding." NSUWorks, 2013. http://nsuworks.nova.edu/gscis_etd/254.

Full text
Abstract:
Aspect-oriented software design (AOSD) enables better and more complete separation of concerns in software-intensive systems. By extracting aspect code and relegating crosscutting functionality to aspects, software engineers can improve the maintainability of their code by reducing code tangling and coupling of code concerns. Further, the number of software defects has been shown to correlate with the number of non-encapsulated nonfunctional crosscutting concerns in a system. Aspect mining is a technique that uses data mining techniques to identify existing aspects in legacy code. Unfortunately, there is a lack of suitably documented test data for aspect-mining research, and none that is fully representative of large-scale legacy systems. Using a new technique called concern seeding--based on the decades-old concept of error seeding--a tool called AspectAssay (akin to the radioimmunoassay test in medicine) was developed. The concern seeding technique allows researchers to seed existing legacy code with nonfunctional crosscutting concerns of known type, location, and quantity, thus greatly increasing the pool of available test data for aspect mining research. Nine seeding test cases were run on a medium-sized codebase using the AspectAssay tool. Each test case seeded a different concern type (data validation, tracing, and observer) and attempted to achieve target values for each of three metrics: 0.95 degree of scattering across methods (DOSM), 0.95 degree of scattering across classes (DOSC), and 10 concern instances. The results were manually verified for their accuracy in producing concerns with known properties (i.e., type, location, quantity, and scattering). The resulting code compiled without errors and was functionally identical to the original. The achieved metrics averaged better than 99.9% of their target values. Following the small tests, each of the three previously mentioned concern types was seeded with a wide range of target metric values on each of two codebases--one medium-sized and one large codebase. The tool targeted DOSM and DOSC values in the range 0.01 to 1.00. The tool also attempted to reach target numbers of concern instances from 1 to 100. Each of these 1,800 test cases was attempted ten times (18,000 total trials). Where mathematically feasible (as permitted by scattering formulas), the tests tended to produce code that closely matched target metric values. Each trial's result was expressed as a percentage of its target value. There were 903 test cases that averaged at least 0.90 of their targets. For each test case's ten trials, the standard deviation of those trials' percentages of their targets was calculated. There was an average standard deviation across all the trials of 0.0169. For the 808 seed attempts that averaged at least 0.95 of their targets, the average standard deviation across the ten trials for a particular target was only 0.0022. The tight grouping of trials for their test cases suggests high repeatability for the AspectAssay technique and tool. The concern seeding technique opens the door for the expansion of aspect mining research. Until now, such research has focused on small, well-documented legacy programs. Concern seeding has proved viable for producing code that is functionally identical to the original and contains concerns with known properties. The process is repeatable and precise across multiple seeding attempts and also accurate for many ranges of target metric values.
Just like error seeding is useful in identifying indigenous errors in programs, concern seeding could also prove useful in estimating indigenous nonfunctional crosscutting concerns, thus introducing a new method for evaluating the performance of aspect mining algorithms.
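For context, a common formulation of the degree-of-scattering metric (in the style of Eaddy et al.) is sketched below; whether AspectAssay uses exactly this formula is not stated in the abstract, so treat the formulation and the example counts as assumptions.

    # Minimal sketch: degree of scattering of a concern over classes/methods.
    # 0 = fully localised in one component, 1 = evenly scattered.
    def degree_of_scattering(lines_per_component):
        """lines_per_component: concern line counts per class or method."""
        total = sum(lines_per_component)
        t = len(lines_per_component)
        if total == 0 or t < 2:
            return 0.0
        conc = [n / total for n in lines_per_component]      # concentration per component
        variance_term = sum((c - 1.0 / t) ** 2 for c in conc)
        return 1.0 - (t / (t - 1)) * variance_term

    print(degree_of_scattering([30, 0, 0, 0]))   # fully localised -> 0.0
    print(degree_of_scattering([10, 10, 10]))    # evenly scattered -> 1.0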
12

Costanzo, Bruno Pontes. "Innovation in impact assessment: a bibliometric review and a practical test." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/3/3134/tde-07112017-145017/.

Full text
Abstract:
A bibliometric study was carried out to identify the main innovations and shortcomings pointed out by scientific research on impact assessment (IA). Out of 1,547 articles published between 1990 and 2015 in two leading journals, IAPA and EIAR, 381 were reviewed for their contents related to new methodological approaches or proposals for improving practice. It was found that innovations and gaps are predominantly treated without regard to IA's theoretical basis. We suggest that IA core values shall always guide innovation. It is proposed that the theoretical boundaries of an IA system should be established beforehand when discussing innovation. The information systematized through a bibliometric approach allowed us to propose a framework that correlates IA theoretical foundations with innovation options in a vertically integrated way.
Um estudo bibliométrico foi desenvolvido para identificar as principais inovações e lacunas apontadas pela pesquisa científica em avaliação de impactos (AI). Dos 1.547 artigos publicados entre 1990 e 2015 nos dois periódicos de maior relevância na área, o IAPA e o EIAR, 381 artigos tiveram seus conteúdos analisados em relação a novas abordagens metodológicas ou propostas para melhoria da prática. Verificou-se que as inovações e lacunas são tratadas predominantemente desconsiderando a base teórica de AI. Sugerimos que os valores fundamentais da avaliação de impactos devem sempre orientar a inovação. Propõe-se que as fronteiras teóricas de um Sistema AI sejam estabelecidas previamente ao se discutir a inovação. A informação sistematizada através de uma abordagem bibliométrica permitiu propor uma estrutura que correlaciona os fundamentos teóricos da avaliação de impactos com as opções de inovação.
13

Cranley, Nikki. "The Implications for Network Recorder Design in a Networked Flight Test Instrumentation Data Acquisition System." International Foundation for Telemetering, 2011. http://hdl.handle.net/10150/595789.

Full text
Abstract:
ITC/USA 2011 Conference Proceedings / The Forty-Seventh Annual International Telemetering Conference and Technical Exhibition / October 24-27, 2011 / Bally's Las Vegas, Las Vegas, Nevada
The higher bandwidth capacities available with the adoption of Ethernet technology for networked FTI data acquisition systems enable more data to be acquired. However, this puts increased demands on the network recorder to support such data rates. During any given flight, the network recorder may log hundreds of gigabytes of data, which must be processed and analyzed in real time or post-flight. This paper describes several approaches that may be adopted to facilitate data-on-demand data mining and data reduction operations. In particular, the filtering and indexing techniques that may be adopted to address this challenge are described.
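As a generic illustration of time-based indexing for data-on-demand access (not the paper's design; the "epoch_seconds payload" record format and sampling interval are assumptions), a coarse timestamp-to-byte-offset index lets a post-flight tool seek directly to a time window:

    # Minimal sketch: build a coarse time index over a large recorder file so a
    # later query can seek straight to a time window instead of scanning it all.
    def build_index(path, every=10_000):
        index = []                      # list of (timestamp, byte_offset)
        with open(path, "rb") as fh:
            offset = 0
            for i, line in enumerate(fh):
                if i % every == 0:
                    ts = float(line.split(b" ", 1)[0].decode())
                    index.append((ts, offset))
                offset += len(line)
        return index

    def seek_to(index, t_start):
        """Return the byte offset of the last index entry not after t_start."""
        candidates = [off for ts, off in index if ts <= t_start]
        return candidates[-1] if candidates else 0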
14

Desta, Feven S., and Mike W. N. Buxton. "The use of RGB Imaging and FTIR Sensors for mineral mapping in the Reiche Zeche underground test mine, Freiberg." TU Bergakademie Freiberg, 2017. https://tubaf.qucosa.de/id/qucosa%3A23190.

Full text
Abstract:
The application of sensor technologies for raw material characterization is growing rapidly, and innovative advancement of the technologies is observed. Sensors are being used as laboratory and in-situ techniques for the characterization and definition of raw material properties. However, the application of sensor technologies to underground mining resource extraction is very limited and highly dependent on the geological and operational environment. In this study, the potential of RGB imaging and FTIR spectroscopy for the characterization of polymetallic sulphide minerals was investigated in a test case at the Freiberg mine. A defined imaging procedure was used to acquire RGB images. The images were georeferenced and mosaicked, and a mineral map was produced using a supervised image classification technique. Five mineral types were identified, and the overall classification accuracy shows the potential of the technique for the delineation of sulphide ores in an underground mine. FTIR data in combination with chemometric techniques were evaluated for discrimination of the test case materials. An experimental design was implemented in order to identify optimal pre-processing strategies. Using the processed data, PLS-DA classification models were developed to assess the capability of the model to discriminate the three material types. The calibration and prediction statistics obtained show the approach is efficient and provides acceptable classification success. In addition, important variables (wavelength locations) responsible for the discrimination of the three material types were identified.
15

Siffer, Alban. "New statistical methods for data mining, contributions to anomaly detection and unimodality testing." Thesis, Rennes 1, 2019. http://www.theses.fr/2019REN1S113.

Full text
Abstract:
Cette thèse propose de nouveaux algorithmes statistiques dans deux domaines différents de la fouille de données: la détection d'anomalies et le test d'unimodalité.Premièrement, une nouvelle méthode non-supervisée permettant de détecter des anomalies dans des flux de données est développée. Celle-ci se base sur le calcul de seuils probabilistes, eux-mêmes utilisés pour discriminer les observations anormales.La force de cette méthode est sa capacité à s'exécuter automatiquement sans connaissance préalable ni hypothèse sur le flux de données d'intérêt.De même, l'aspect générique de l'algorithme lui permet d'opérer dans des domaines d'application variés. En particulier, nous développons un cas d'usage en cyber-sécurité.Cette thèse développe également un nouveau test d'unimodalité qui permet de déterminer si une distribution de données comporte un ou plusieurs modes. Ce test est nouveau par deux aspects: sa capacité à traiter des distributions multivariées mais également sa faible complexité, lui permettant alors d'être appliqué en temps réel sur des flux de données.Cette composante plus fondamentale a principalement des applications dans d'autres domaines du data mining tels que le clustering. Un nouvel algorithme cherchant incrémentalement le paramétrage de k-means est notamment détaillé à la fin de ce manuscrit
This thesis proposes new statistical algorithms in two different data mining areas: anomaly detection and unimodality testing. First, a new unsupervised method for detecting outliers in streaming data is developed. It is based on the computation of probabilistic thresholds, which are themselves used to discriminate abnormal observations. The strength of this method is its ability to run automatically without prior knowledge or hypotheses about the input data. Similarly, the generic nature of the algorithm makes it able to operate in various fields. In particular, we develop a cyber-security use case. This thesis also proposes a new unimodality test which determines whether a data distribution has one or several modes. This test is new in two respects: its ability to handle multivariate distributions and its low complexity, allowing it to be applied to streaming data. This more fundamental component has applications mainly in other areas of data mining such as clustering. A new algorithm that incrementally searches for the k-means parameter setting is detailed at the end of this manuscript.
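A greatly simplified illustration of threshold-based streaming anomaly detection follows (the thesis computes its probabilistic thresholds differently; this sketch only uses a high empirical quantile of the recent history, and the data are synthetic):

    # Simplified sketch: flag points above a high quantile of a sliding window.
    from collections import deque
    import random

    def detect(stream, window=500, q=0.999):
        history = deque(maxlen=window)
        for i, x in enumerate(stream):
            if len(history) >= 100:
                threshold = sorted(history)[int(q * (len(history) - 1))]
                if x > threshold:
                    yield i, x, threshold        # anomaly candidate
            history.append(x)

    random.seed(0)
    data = [random.gauss(0, 1) for _ in range(5000)]
    data[3000] = 8.0                              # injected anomaly
    for i, x, thr in detect(data):
        print(f"index {i}: value {x:.2f} exceeds threshold {thr:.2f}")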
16

Najah, Idrissi Amel. "Contribution à l'unification de critères d'association pour variables qualitatives." Paris 6, 2000. http://www.theses.fr/2000PA066348.

Full text
17

Kurin, Erik, and Adam Melin. "Data-driven test automation : augmenting GUI testing in a web application." Thesis, Linköpings universitet, Programvara och system, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-96380.

Full text
Abstract:
For many companies today, it is highly valuable to collect and analyse data in order to support decision making and functions of various sorts. However, this kind of data-driven approach is seldom applied to software testing, and there is often a lack of verification that the testing performed is relevant to how the system under test is used. Therefore, the aim of this thesis is to investigate the possibility of introducing a data-driven approach to test automation by extracting user behaviour data and curating it to form input for testing. A prestudy was initially conducted in order to collect and assess different data sources for augmenting the testing. After suitable data sources were identified, the required data, including data about user activity in the system, was extracted. This data was then processed, and three prototypes were built on top of it. The first prototype augments model-based testing by automatically creating models of the most common user behaviour by utilising data mining algorithms. The second prototype tests the most frequently occurring client actions. The last prototype visualises which features of the system are not covered by automated regression testing. The data extracted and analysed in this thesis facilitates the understanding of the behaviour of the users in the system under test. The three prototypes implemented with this data as their foundation can be used to assist other testing methods by visualising test coverage and executing regression tests.
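As a minimal sketch of the underlying idea (not the thesis prototypes; the CSV file name and column names are assumptions), ranking the most frequent user actions from an activity log gives a simple prioritisation signal for automated tests:

    # Minimal sketch: rank user actions by frequency so the most common
    # behaviour can be prioritised in automated regression tests.
    import csv
    from collections import Counter

    def top_actions(log_csv, n=10):
        counts = Counter()
        with open(log_csv, newline="", encoding="utf-8") as fh:
            for row in csv.DictReader(fh):
                counts[row["action"]] += 1
        return counts.most_common(n)

    if __name__ == "__main__":
        for action, freq in top_actions("user_activity.csv"):
            print(f"{freq:6d}  {action}")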
18

Ricci, Mattia. "Sentiment analysis su test prenatali: un caso di studio basato su Twitter e reddit." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019.

Find full text
Abstract:
Living in an era in which technology is within everyone's reach, information about events, objects and people has become quick and easy to find for everyone, especially for researchers, who are no longer forced to request information person by person but can find what they are looking for directly on the web. For this reason, traditional surveys are no longer optimal tools for keeping up with the constant evolution of (cyber)opinion; they are expensive to design and implement, as well as static by definition. Over the last 15 years, sentiment analysis has become increasingly popular; the term denotes an approach aimed at classifying the opinions contained in a written text by computational means, in order to extract subjective information, opinions and sentiments from the sources under analysis. Its popularity has grown hand in hand with the massive spread and importance acquired by social networks such as Twitter, Facebook and Google+, precisely because these social networks have become a medium through which users can express themselves, reporting the positive and negative moments of their day. This thesis project originates from a collaboration with the Detroit Medical Center and Wayne State University, following recent substantial progress in the medical field of prenatal diagnostics. There was a need for an automated analysis of the mood of patients who are prescribed a series of different prenatal tests. The aim of this study is therefore to provide a sentiment-oriented graphical view of a set of keywords that will be presented within the thesis document. The collected data cover a time span of 7 years, and it will be interesting to analyse how the sentiment intensity towards traditional techniques changes over the years and with the new technological possibilities provided by the new alternatives.
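For illustration of lexicon-based sentiment scoring of short posts (a generic English example using NLTK's VADER analyser; the texts are invented and this is not the pipeline built in the thesis, which works on tweets and reddit posts about prenatal testing):

    # Minimal sketch: score short posts with NLTK's VADER sentiment analyser.
    import nltk
    from nltk.sentiment import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)
    sia = SentimentIntensityAnalyzer()

    posts = [
        "So relieved, the screening results came back fine!",
        "Still waiting for the results and I can't stop worrying.",
    ]
    for text in posts:
        scores = sia.polarity_scores(text)   # 'compound' is in [-1, 1]
        print(f"{scores['compound']:+.3f}  {text}")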
19

Mulazzani, Alberto. "Social media sensing: Twitter e Reddit come casi di studio e comparazione applicati ai test prenatali." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018.

Find full text
Abstract:
Having a child can, for many people, be the greatest joy of their life, but pregnancy is one of the most delicate moments in a woman's life and as such must be carefully monitored in every aspect. This process is not always free of risks, and what should be a moment of happiness can turn into a difficult one. This study aims to provide a sentiment-oriented graphical view of a set of keywords related to the world of prenatal diagnoses and prenatal tests. Data obtained from two social networking platforms, Reddit and Twitter, over the period from 01/01/2011 to 31/03/2018 are presented in order to answer two fundamental questions: how much the volume of data changed over the analysed period, with a one-month unit of measure, and how much the sentiment or opinion of the data changed over the analysed period, with a one-month unit of measure.
20

Avsar, Casatay. "Breakage Characteristics Of Cement Components." Phd thesis, METU, 2003. http://etd.lib.metu.edu.tr/upload/2/587147/index.pdf.

Full text
Abstract:
The production of multi-component cement from clinker and two additives such as trass and blast furnace slag has now spread throughout the world. These additives are generally interground with clinker to produce a composite cement of specified surface area. The grinding stage is of great importance, as it accounts for a major portion of the total energy consumed in cement production and also affects the quality of composite cements through the particle size distribution of the individual additives produced during grinding. This thesis study was undertaken to characterize the breakage properties of clinker and the additives trass and slag, with the intention of delineating their grinding properties in separate and intergrinding modes. Single particle breakage tests were conducted by means of a drop weight tester in order to define an inherent grindability for the clinker and trass samples in terms of the median product size. In addition, a back-calculation procedure was applied to obtain the breakage rate parameters of the perfect mixing ball mill model using industrial data from a cement plant. Kinetic and locked-cycle grinding tests were performed in a standard Bond mill to determine breakage rates and distribution functions for clinker, trass and slag. Bond work indices of these cement components and of their binary and ternary mixtures were determined and compared. Attempts were made to use back-calculated grinding rate parameters to simulate the Bond grindability test. The self-similarity law was shown to hold for clinker and trass, in that the shapes of their self-similarity curves are unique to the feed material and independent of the grinding energy expended and the overall fineness attained. The self-similar behaviour of the tested materials will enable process engineers to obtain useful information about inherent grindability and energy consumption at any stage of the comminution process. The parameters indicating the degree of size reduction were defined with different theoretical approaches as a function of energy consumption, using the single particle breakage test data of clinker and trass. The breakage distribution functions were found to be non-normalizable. On the other hand, the breakage rate functions were found to be constant with respect to time but variable with respect to changing composition in the Bond ball mill. These variations are critical in computer simulation of any test aiming to minimize the experimental effort of the standard procedure. As a result of the back calculation of breakage rate parameters for the clinker and trass samples in the Bond mill, no common pattern was seen in the variation of the rate parameters. Therefore, computer simulation of the Bond grindability test did not result in an accurate estimation of the Bond work index.
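For reference, the standard Bond relation commonly used alongside the work index (a textbook formula, not a result specific to this thesis) links specific grinding energy to feed and product fineness:

\[ W = W_i \left( \frac{10}{\sqrt{P_{80}}} - \frac{10}{\sqrt{F_{80}}} \right) \]

where \(W\) is the specific energy in kWh/t, \(W_i\) the Bond work index, and \(F_{80}\), \(P_{80}\) the 80% passing sizes of feed and product in micrometres.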
21

Pelosi, Serena. "Detecting subjectivity through lexicon-grammar. strategies databases, rules and apps for the italian language." Doctoral thesis, Universita degli studi di Salerno, 2016. http://hdl.handle.net/10556/2208.

Full text
Abstract:
2014 - 2015
The present research handles the detection of linguistic phenomena connected to subjectivity, emotions and opinions from a computational point of view. The necessity to quickly monitor huge quantities of semi-structured and unstructured data from the web poses several challenges to Natural Language Processing, which must provide strategies and tools to analyze their structures from lexical, syntactic and semantic points of view. The general aim of Sentiment Analysis, shared with the broader fields of NLP, Data Mining, Information Extraction, etc., is the automatic extraction of value from chaos; its specific focus, instead, is on opinions rather than on factual information. This is the aspect that differentiates it from other computational linguistics subfields. The majority of sentiment lexicons have been manually or automatically created for the English language; therefore, existing Italian lexicons are mostly built through the translation and adaptation of English lexical databases, e.g. SentiWordNet and WordNet-Affect. Unlike many other Italian and English sentiment lexicons, our database SentIta, built on the interaction of electronic dictionaries and lexicon-dependent local grammars, is able to manage simple and multiword structures, which can take the shape of distributionally free structures, distributionally restricted structures and frozen structures. Moreover, differently from other lexicon-based Sentiment Analysis methods, our approach is grounded in the solidity of the Lexicon-Grammar resources and classifications, which provide fine-grained semantic but also syntactic descriptions of the lexical entries. In line with the major contributions in the Sentiment Analysis literature, we did not consider polar words in isolation. We computed their elementary sentence contexts, with the allowed transformations, and then their interaction with contextual valence shifters, the linguistic devices that can modify the prior polarity of the words from SentIta when they occur in the same sentences. In order to do so, we took advantage of the computational power of finite-state technology. We formalized a set of rules that model intensification, downtoning and negation, detect modality, and analyse comparative forms. With regard to the applicative part of the research, we conducted, with satisfactory results, three experiments on as many Sentiment Analysis subtasks: the sentiment classification of documents and sentences, feature-based Sentiment Analysis, and sentiment-based Semantic Role Labeling. [edited by author]
XIV n.s.
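A minimal sketch of how contextual valence shifters modify a prior lexicon polarity is shown below; the tiny lexicon and shifter weights are invented for illustration and are much simpler than SentIta and its local grammars.

    # Minimal sketch: lexicon-based polarity with negation and intensification.
    LEXICON = {"good": 1.0, "bad": -1.0, "awful": -2.0, "nice": 1.0}
    INTENSIFIERS = {"very": 1.5, "extremely": 2.0}
    NEGATIONS = {"not", "never", "no"}

    def score(sentence):
        words = sentence.lower().split()
        total = 0.0
        for i, w in enumerate(words):
            if w not in LEXICON:
                continue
            value = LEXICON[w]
            for prev in words[max(0, i - 2):i]:   # look two words back
                if prev in INTENSIFIERS:
                    value *= INTENSIFIERS[prev]
                if prev in NEGATIONS:
                    value *= -0.5                 # negation flips and dampens
            total += value
        return total

    print(score("the movie was not very good"))  # negative
    print(score("an extremely nice surprise"))   # strongly positive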
22

Mueller, Marianne Larissa [Verfasser], Stefan [Akademischer Betreuer] Kramer, and Frank [Akademischer Betreuer] Puppe. "Data Mining Methods for Medical Diagnosis : Test Selection, Subgroup Discovery, and Constrained Clustering / Marianne Larissa Mueller. Gutachter: Stefan Kramer ; Frank Puppe. Betreuer: Stefan Kramer." München : Universitätsbibliothek der TU München, 2012. http://d-nb.info/1024964264/34.

Full text
23

Yanik, Todd E. "Detection of erroneous payments utilizing supervised and unsupervised data mining techniques." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2004. http://library.nps.navy.mil/uhtbin/hyperion/04Sep%5FYanik.pdf.

Full text
24

Alkilicgil, Cigdem. "Development Of A New Method For Mode I Fracture Toughness Test On Disc Type Rock Specimens." Master's thesis, METU, 2006. http://etd.lib.metu.edu.tr/upload/12607513/index.pdf.

Full text
Abstract:
A new testing method was introduced and developed to determine the Mode I fracture toughness of disc-type rock specimens. The new method was named Straight Notched Disc Bending, and it uses disc specimens under three-point bending. 3D numerical modeling was carried out with the finite element program ABAQUS to find stress intensity factors for both the well-known Semi-circular Bending specimen models and the Straight Notched Disc Bending specimen models for varying disc geometries. Both specimen types included notches, with a crack front introduced at the tip of the notch to compute the stress intensity factors. For the stress intensity analysis, the crack front-upper loading point distance and the span length between the two roller supports at the bottom boundary of the specimens were varied. Fracture toughness testing was carried out on Ankara Gölbaşı pink-colored andesite for both specimen types; the crack front-upper loading point distance and the span length between the two roller supports at the bottom boundary of the specimens were changed during the tests. For both specimen geometries, notch lengths ranging from 5 mm to 20 mm were used. For each notch length, two different roller supports with span lengths of 60 mm and 70 mm were used. For both methods, the fracture toughness values determined using numerically computed stress intensity factors and the failure loads obtained from the experiments were very close; the new method was verified by comparing the results. The new method has the advantages of lower confining pressure at the crack front and lower stress intensities, with a possibly smaller crack tip plasticity region.
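For context, the textbook Mode I relation (not a formula quoted from the thesis) from which fracture toughness is obtained at the failure load is

\[ K_{\mathrm{I}} = Y \, \sigma \sqrt{\pi a}, \qquad K_{\mathrm{IC}} = Y \, \sigma_{\mathrm{f}} \sqrt{\pi a} \]

where \(a\) is the crack length, \(\sigma\) the applied stress, \(\sigma_{\mathrm{f}}\) the stress at failure, and \(Y\) a dimensionless geometry factor, which is what the numerical models of each specimen geometry provide.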
25

Camalan, Mahmut. "Size-by-size Analysis Of Breakage Parameters Of Cement Clinker Feed And Product Samples Of An Industrial Roller Press." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12614594/index.pdf.

Full text
Abstract:
The main objective in this study is to compare breakage parameters of narrow size fractions of cement clinker taken from the product end and feed end of industrial-scale high pressure grinding rolls (HPGR) in order to assess whether the breakage parameters of clinker broken in HPGR are improved or not. For this purpose, drop weight tests were applied to six narrow size fractions above 3.35 mm, and batch grinding tests were applied to three narrow size fractions below 3.35 mm. It was found that the breakage probabilities of coarse sizes and breakage rates in fine sizes were higher in the HPGR product. This indicated that clinker broken by HPGR contained weaker particles due to cracks and damage imparted. However, no significant weakening was observed for the -19.0+12.7 mm HPGR product. Although HPGR product was found to be weaker than HPGR feed, fragment size distribution of HPGR product did not seem to be finer than that of the HPGR feed at a given loading condition in either the drop weight test or batch grinding test. Also, drop weight tests on HPGR product and HPGR feed showed that the breakage distribution functions of coarse sizes depended on particle size and impact energy (J). Batch grinding tests showed that the specific breakage rates of HPGR product and HPGR feed were non-linear which could be represented with a fast initial breakage rate and a subsequent slow breakage rate. The fast breakage rates of each size fraction of HPGR product were higher than HPGR feed due to cracks induced in clinker by HPGR. However, subsequent slow breakage rates of HPGR product were close to those of HPGR feed due to elimination of cracks and disappearance of weaker particles. Besides, the variation in breakage rates of HPGR product and HPGR feed with ball size and particle size also showed an abnormal breakage zone where ball sizes were insufficient to effectively fracture the coarse particles. Breakage distribution functions of fine sizes of HPGR product and HPGR feed were non-normalizable and depended on particle size to be ground. However, batch grinding of -2.36+1.7 mm and -1.7+1.18 mm HPGR feed yielded the same breakage pattern.
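For reference, the specific breakage rates discussed here are usually defined against the standard first-order batch-grinding relation (a textbook relation, not taken from the thesis):

\[ w_1(t) = w_1(0)\, e^{-S_1 t} \]

where \(w_1(t)\) is the mass fraction remaining in the top size interval after grinding time \(t\) and \(S_1\) is its specific breakage rate; the fast-then-slow behaviour reported above is a departure from this single-exponential form.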
26

Chanov, Michael Kiprian. "Potential Coal Slurry Toxicity to Laboratory and Field Test Organisms in the Clinch River Watershed and the Ecotoxicological Recovery of Two Remediated Acid Mine Drainage Streams in the Powell River Watershed, Virginia." Thesis, Virginia Tech, 2009. http://hdl.handle.net/10919/33904.

Full text
Abstract:
The Clinch and Powell Rivers located in Southwestern Virginia contain some of the most diverse freshwater mussel assemblages found throughout North America. However, in recent decades mussel species decline has been documented by researchers. The presence of coal mining activity in the watersheds has been hypothesized to be linked to the decline of numerous species and the extirpation of others. The effects of various discharges from an active coal preparation plant facility located in Honaker, Virginia were evaluated for acute and chronic toxicity using field and laboratory tests. The results of the study suggested that the primary effluent from the coal preparation facility had acute and chronic toxicity; however, the settling pond system utilized at this plant mitigated the impacts of the plant from reaching the Clinch River. Along with active mine discharges, acid mine drainage (AMD) has been documented as another potential stressor. Ecotoxicological recovery was evaluated in two acid mine drainage impacted subwatersheds (Black Creek and Ely Creek) in the Powell River watershed following remediation. The results in Ely Creek suggested that successive alkalinity producing systems were effective in mitigating the harmful impacts of AMD as previously impacted sites had decreased water column aluminum and iron levels in conjunction with increased survival in laboratory toxicity tests conducted with Ceriodaphnia dubia and Daphnia magna. Corbicula fluminea (Asian clam) in-situ tests confirmed the results in the laboratory tests as all sites located below the remediated areas had improved survival. However, active AMD influences and loss of quality habitat seemed to be hindering the recovery of the benthic macroinvertebrate community located in Ely Creek. In Black Creek, re-mining and outlet control pond construction have not resulted in a successful remediation in the lower subwatershed. A decrease in Ecotoxicological Ratings at some of the lowest mainstem sites compared to pre-remediation data was observed. Furthermore, decreased survival in sediment associated toxicity tests with D. magna in 2007-08 was supported by 100% Asian clam mortality at the LBC-5 and LBC-6 sites in 2007, while growth impairment in 2008 was observed at the LBC-6 site.
Master of Science
27

Höckert, Linda. "Kemisk stabilisering av gruvavfall från Ljusnarsbergsfältet med mesakalk och avloppsslam." Thesis, Uppsala University, Department of Earth Sciences, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-88825.

Full text
Abstract:

Mine waste from Ljusnarsbergsfältet in Kopparberg, Sweden, is considered to constitute a great risk for human health and the surrounding environment. Some of the waste rock consists of sulphide minerals. When sulphide minerals come into contact with dissolved oxygen and precipitation, oxidation may occur resulting in acid mine drainage (AMD) and the release of heavy metals. The purpose of this study has been to characterise the waste material and try to chemically stabilize the waste rock with a mixture of sewage sludge and calcium carbonate. The drawback of using organic matter is the risk that dissolved organic matter can act as a complexing agent for heavy metals and in this way increase their mobility. An additional study to examine this risk has therefore also been performed.

The project started with a pilot study in order to identify the material fraction that was suitable for the experiment. When suitable material had been chosen, a column test was carried out for the purpose of studying the slurry’s influence on the mobility of metals along with the production of acidity. To clarify the organic material’s potential for complexation, a pH-stat batch test was used. Drainage water samples from the columns were taken regularly during the experiment. These samples were analysed for pH, electrical conductivity, alkalinity, redox potential, dissolved organic carbon (DOC), sulphate and leaching metals. The effluent from the pH-stat test was analysed only on a few occasions, and only for metal content and change in DOC concentration.

The results from the laboratory experiments showed that the waste rock from Ljusnarsberg easily leached large amounts of metals. The stabilization of the waste rock succeeded in maintaining a near-neutral pH in the rock waste leachate, compared to a pH 3 leachate from untreated rock waste. The average concentrations of copper and zinc in the leachate from untreated waste rock exceeded 100 and 1000 mg/l respectively, while these metals were detected at concentrations around 0.1 and 1 mg/l, respectively, in the leachate from the treated wastes. The examined metals had concentrations between 40 and 4000 times lower in the leachate from treated waste rock, which implies that the stabilisation with reactive amendments succeeded. The long-term effects are, however, not determined. The added sludge contributed to immobilising metals at neutral pH despite a small increase in DOC concentration. The problem with adding sludge is that if pH decreases with time, there is a risk of increased metal leaching.


Gruvavfallet från Ljusnarsbergsfältet i Kopparberg anses utgöra en stor risk för människors hälsa och den omgivande miljön. En del av varpmaterialet, ofyndigt berg som blir över vid malmbrytning, utgörs av sulfidhaltigt mineral. Då varpen exponeras för luft och nederbörd sker en oxidation av sulfiderna, vilket kan ge upphov till surt lakvatten och läckage av tungmetaller. Syftet med arbetet har varit att karaktärisera varpen och försöka stabilisera den med en blandning bestående av mesakalk och avloppsslam, samt att undersöka risken med det lösta organiska materialets förmåga att komplexbinda metaller och på så vis öka deras rörlighet.

Efter insamling av varpmaterial utfördes först en förstudie för att avgöra vilken fraktion av varpen som var lämplig för försöket. När lämpligt material valts ut utfördes kolonntest för att studera slam/kalk-blandningens inverkan på lakning av metaller, samt pH-statiskt skaktest för att bedöma komplexbildningspotentialen hos det organiska materialet vid olika pH värden. Från kolonnerna togs lakvattenprover kontinuerligt ut under försökets gång för analys med avseende på pH, konduktivitet, alkalinitet, redoxpotential, löst organiskt kol (DOC), sulfat och utlakade metaller. Lakvattnet från pH-stat-testet provtogs vid ett fåtal tillfällen och analyserades endast med avseende på metallhalter och förändring i DOC-halt.

Resultatet från den laborativa studien visade att varpmaterialet från Ljusnarsberg lätt lakades på stora mängder metaller. Den reaktiva tillsatsen lyckades uppbringa ett neutralt pH i lakvattnet från avfallet, vilket kan jämföras med lakvattnet från den obehandlade kolonnen som låg på ett pH kring 3. Medelhalten av koppar och zink översteg under försöksperioden 100 respektive 1000 mg/l i lakvattnet från det obehandlade avfallet, medan halterna i det behandlade materialets lakvatten låg kring 0,1 respektive 1 mg/l. Av de studerade metallerna låg halterna 40-4000 gånger lägre i lakvattnet från den behandlade kolonnen, vilket innebär att slam/kalk-blandningen har haft verkan. Stabiliseringens långtidseffekt är dock okänd. Det tillsatta slammet resulterade inte i någon större ökning av DOC-halten i det pH-intervall som åstadkoms med mesakalken. Utifrån pH-stat-försöket kunde det konstateras att det tillsatta slammet bidrog till metallernas immobilisering vid neutralt pH, trots en liten ökning av DOC-halten. Om en sänkning av pH skulle ske med tidens gång föreligger dock risk för ökat metalläckage.

28

Canul, Reich Juana. "An Iterative Feature Perturbation Method for Gene Selection from Microarray Data." Scholar Commons, 2010. https://scholarcommons.usf.edu/etd/1588.

Full text
Abstract:
Gene expression microarray datasets often consist of a limited number of samples relative to a large number of expression measurements, usually on the order of thousands of genes. These characteristics pose a challenge to any classification model as they might negatively impact its prediction accuracy. Therefore, dimensionality reduction is a core process prior to any classification task. This dissertation introduces the iterative feature perturbation method (IFP), an embedded gene selector that iteratively discards non-relevant features. IFP considers relevant features as those which, after perturbation with noise, cause a change in the predictive accuracy of the classification model. Non-relevant features do not cause any change in the predictive accuracy in such a situation. We apply IFP to 4 cancer microarray datasets: colon cancer (cancer vs. normal), leukemia (subtype classification), Moffitt colon cancer (prognosis predictor) and lung cancer (prognosis predictor). We compare results obtained by IFP to those of SVM-RFE and the t-test, using a linear support vector machine as the classifier in all cases. We do so using the original entire set of features in the datasets, and using a preselected set of 200 features (based on p values) from each dataset. When using the entire set of features, the IFP approach results in comparable accuracy (and higher at some points) with respect to SVM-RFE on 3 of the 4 datasets. The simple t-test feature ranking typically produces classifiers with the highest accuracy across the 4 datasets. When using 200 features chosen by the t-test, the accuracy results show up to 3% performance improvement for both IFP and SVM-RFE across the 4 datasets. We corroborate these results with an AUC analysis and a statistical analysis using the Friedman/Holm test. Similar to the application of the t-test, we used the methods information gain and reliefF as filters and compared all three. Results of the AUC analysis show that IFP and SVM-RFE obtain the highest AUC value when applied to the t-test-filtered datasets. This result is additionally corroborated with statistical analysis. The percentage of overlap between the gene sets selected by any two methods across the four datasets indicates that different sets of genes can and do result in similar accuracies. We created ensembles of classifiers using the bagging technique with IFP, SVM-RFE and the t-test, and showed that their performance can be at least equivalent to that of the non-bagging cases, as well as better in some cases.
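For illustration of the SVM-RFE baseline that IFP is compared against (a sketch on synthetic "microarray-like" data; the dataset sizes, number of selected genes and library choice are assumptions, not values from the dissertation):

    # Minimal SVM-RFE-style sketch: recursive feature elimination with a
    # linear SVM on a synthetic samples-by-genes matrix.
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import RFE
    from sklearn.svm import LinearSVC

    # 60 samples x 2000 "genes", only a handful informative
    X, y = make_classification(n_samples=60, n_features=2000, n_informative=10,
                               n_redundant=0, random_state=0)

    selector = RFE(LinearSVC(C=1.0, max_iter=5000),
                   n_features_to_select=50, step=0.1)
    selector.fit(X, y)

    selected = selector.get_support(indices=True)
    print("kept", len(selected), "features, e.g.:", selected[:10])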
APA, Harvard, Vancouver, ISO, and other styles
29

Echols, Brandi Shontia. "Use of an environmentally realistic laboratory test organism and field bioassessments to determine the potential impacts of active coal mining in the Dumps Creek subwatershed on the Clinch River, Virginia." Diss., Virginia Tech, 2011. http://hdl.handle.net/10919/77326.

Full text
Abstract:
This research was divided into four objectives for assessing the impacts of coal mining on ecosystem health. The first objective was to provide an ecotoxicological assessment in the upper Clinch River using standard bioassessment techniques. Analysis of sediments and interstitial water (porewater) indicated higher concentrations of trace metals in samples from sites located above both a power plant (CRP) and the Dumps Creek mining influences. The furthest sampling site, located near Pounding Mill, Virginia (CR-PM), had higher concentrations of aluminum (2,250.9 mg/kg), copper (5.9 mg/kg) and iron (12,322.6 mg/kg) than samples collected directly below the Dumps Creek confluence (site CR-2). Similar results were obtained from in-situ bioaccumulation tests with the Asian clam (Corbicula fluminea) in 2009. Aluminum (7.81 mg/kg), Fe (48.25 mg/kg) and Zn (7.69 mg/kg) accumulated in higher concentrations at the CR-PM site than at CR-2. However, the site located below the CRP effluent discharges (CR-3L) on the left bank had substantially higher concentrations of Al (14.19 mg/kg), Cu (6.78 mg/kg), Fe (88.78 mg/kg) and Zn (7.75 mg/kg) than both CR-PM and samples collected directly opposite this site at CR-3R. To further understand the potential impact of active mining on the Clinch River, a more comprehensive ecotoxicological evaluation was conducted in the Dumps Creek subwatershed. Field bioassessments determined that biological impairment occurred directly below a deep mine discharge (CBP 001), which was characterized by a distinct hydrogen sulfide odor. Total abundance and richness of benthic macroinvertebrates decreased to 3.5-20 and 1.25-2.3, respectively, at DC-1 Dn. The discharge also caused the proliferation of a sulfur-oxidizing bacterium, Thiothrix nivea. During continuous discharge of the effluent, the bacterium was observed coating all surfaces at DC-1 Dn and may also contribute to an Fe-encrusted biofilm observed on in-situ clams at the downstream site, DC-2 Dn. Toxicity tests with mining effluents indicated some potential toxicity of the 001 discharge, but this was variable between test organisms. Selecting the most appropriate test species for sediment and water column assays has been a primary goal for ecotoxicologists. Standard test organisms and established test guidelines exist, but US EPA recommended species may not be the most sensitive organisms to anthropogenic inputs. Therefore, Chapters Three and Four addressed the use of mayflies in routine laboratory testing. Preliminary results of toxicity tests with the mayfly Isonychia sp. (Ephemeroptera) suggested that Isonychia were moderately sensitive to NaCl after 96 hr, with an average LC50 value of 3.10 g NaCl/L. When exposed to a coal-mine processed effluent, Isonychia generated LC50 values that ranged from 13 to 39% effluent and were more sensitive to the effluent than Ceriodaphnia dubia. Based on the results of the feasibility study presented in Chapter Four, field-collected organisms appear to be too unpredictable in their test responses and, therefore, such tests would be unreliable as stand-alone indicators of effluent toxicity.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
30

Sorrenti, Estelle. "Étude de la passivation de la pyrite : chimie de surface et réactivité." Thesis, Vandoeuvre-les-Nancy, INPL, 2007. http://www.theses.fr/2007INPL054N/document.

Full text
Abstract:
To combat the phenomenon of acid mine drainage (AMD), we studied the possibility of passivating/inerting sulphide mine wastes. The inhibition of the surface oxidation of pyritic phases was achieved by adsorption of molecules: humic acid (HA), thymol and sodium silicate. The fundamental study, carried out on a pure pyrite (1-5 g), was then extended to mine wastes (2 kg). The adsorption of the passivating molecules was performed under dynamic conditions (chromatographic column) and static conditions (batch). The order of efficiency is: humic acid > thymol > sodium silicate. The dynamic experiments showed that the adsorption of HA on pyrite is irreversible. Cyclic voltammetry showed that low concentrations of adsorbed HA (0.15 to 0.3 mg/g, θ < 1) are sufficient to block more than 90% of the initial electrochemical activity. Surface analysis by diffuse-reflectance IR spectroscopy highlighted the importance of the superficial oxidized phases in the adsorption process. The chromatographic fronts could be described with a dynamic trimodal model based on the existence of three adsorption sites whose chemical nature, number and accessibility evolve during adsorption. Further experiments in humidity cells, simulating the behavior of a waste rock from Abitibi-Témiscamingue under natural storage conditions, showed that the HA treatment remains effective for more than 30 equivalent years. A waste rock treated with HA no longer generates AMD, whereas the untreated waste generates acid during the first 6 years.
APA, Harvard, Vancouver, ISO, and other styles
31

Wagler, Marit. "Effekte von abwasserinduzierten Ionenimbalanzen auf die Reproduktion von Fischen am Beispiel von Danio rerio." Doctoral thesis, Humboldt-Universität zu Berlin, 2020. http://dx.doi.org/10.18452/21640.

Full text
Abstract:
The potash mining industry discharges saline effluents which generate ion imbalances in natural freshwater systems and cause severe secondary salinization of the river Werra in Germany. The effects of these ion imbalances on the reproduction of freshwater fish were investigated using the model fish species Danio rerio. Five different combinations of elevated ion concentrations, adjusted to the current threshold values for the discharge of potash mining effluents in Germany, were tested. During a partial life cycle test, adult fish were exposed to the salt combinations for 35 days. Subsequently, the offspring were exposed to the same concentrations until hatch, and the larvae were further reared at the exposure concentrations from hatch until the 8th day post fertilization. Additionally, a standard early life stage test (ELST) with offspring from unexposed parents was performed. Compared to naturally occurring ion concentrations and ratios in freshwater systems, the fertilization rate of the eggs was significantly lower for all ion combinations, while coagulation and deformation rates were significantly higher. The early life stage tests on embryos and larvae revealed premature and prolonged hatching times, reduced survival rates, increased deformation and heart rates, and irregularities in the whole-body content of K, Mg, Na and Ca and in whole-body Ca:Mg ratios at elevated ion concentrations and imbalanced ion ratios. Compared to the effects on reproduction and development of the offspring, the effects on the parental generation were moderate. The results of this dissertation indicate that partial life cycle tests, rather than fish egg tests or ELST, are needed to examine most sensitively the effects of ion imbalances caused by potash mining effluents on the reproduction and early development of freshwater fish. Neither the current German threshold values nor the reduced values planned until 2027 can be regarded as safe for the reproduction of freshwater fish.
APA, Harvard, Vancouver, ISO, and other styles
32

Wärn, Caroline. "Deviating time-to-onset in predictive models : detecting new adverse effects from medicines." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-257100.

Full text
Abstract:
Identifying previously unknown adverse drug reactions becomes more important as the number of drugs and the extent of their use increases. The aim of this Master’s thesis project was to evaluate the performance of a novel approach for highlighting potential adverse drug reactions, also known as signal detection. The approach was based on deviating time-to-onset patterns and was implemented as a two-sample Kolmogorov-Smirnov test for non-vaccine data in the safety report database, VigiBase. The method was outperformed by both disproportionality analysis and the multivariate predictive model vigiRank. Performance estimates indicate that deviating time-to-onset patterns is not a suitable approach for signal detection for non-vaccine data in VigiBase.
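A minimal sketch of the statistical core of this approach, comparing the time-to-onset distribution of one drug-event pair against a background distribution with a two-sample Kolmogorov-Smirnov test, is shown below. The numbers are invented placeholders and the significance threshold is an assumption; this is not VigiBase code.

```python
from scipy.stats import ks_2samp

def tto_signal(onset_days_pair, onset_days_background, alpha=0.05):
    """Flag a drug-event pair whose time-to-onset distribution deviates
    from the background distribution (two-sample Kolmogorov-Smirnov test)."""
    stat, p_value = ks_2samp(onset_days_pair, onset_days_background)
    return {"ks_statistic": stat, "p_value": p_value, "signal": p_value < alpha}

# Hypothetical example: days from start of treatment to adverse event onset.
pair = [3, 5, 7, 7, 9, 12, 14, 15, 21, 30]
background = [1, 2, 2, 4, 30, 60, 90, 120, 180, 365, 400, 500]
print(tto_signal(pair, background))
```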
APA, Harvard, Vancouver, ISO, and other styles
33

Salehian, Ali. "PREDICTING THE DYNAMIC BEHAVIOR OF COAL MINE TAILINGS USING STATE-OF-PRACTICE GEOTECHNICAL FIELD METHODS." UKnowledge, 2013. http://uknowledge.uky.edu/ce_etds/9.

Full text
Abstract:
This study is focused on developing a method to predict the dynamic behavior of mine tailings dams under earthquake loading. Tailings dams impound the fine-grained by-products of coal mining and processing, and mine tailings impoundments are prone to instability and failure under seismic loading as a result of the mechanical behavior of the tailings. Due to the existence of potential seismic sources in close proximity to the coal mining regions of the United States, it is necessary to assess the post-earthquake stability of these tailings dams. To develop the aforementioned methodology, 34 cyclic triaxial tests along with vane shear tests were performed on undisturbed mine tailings specimens from two impoundments in Kentucky, and the liquefaction resistance and residual shear strength of the specimens were measured. Laboratory cyclic strength curves for the coal mine specimens were produced, and the relationships between plasticity, density, cyclic stress ratio, and number of cycles to liquefaction were identified. The samples from the Big Branch impoundment were generally loose, while the Abner Fork specimens were dense, older and slightly cemented. The data suggest that the number of loading cycles required to initiate liquefaction in mine tailings, NL, decreases with increasing CSR and with decreasing density. This trend is similar to what is typically observed in soil. For a number of selected specimens, shear modulus reduction curves and damping ratio plots were created from the results of a series of small-strain cyclic triaxial tests. The data obtained from the laboratory experiments were correlated with previously recorded geotechnical field data from the two impoundments. The field parameters, including the SPT blow counts (N1)60, corrected CPT cone tip resistance (qt), and shear wave velocity (vs), were correlated with the laboratory-measured cyclic resistance ratio (CRR). The results indicate that, in general, the higher the (N1)60 and the tip resistance (qt), the higher the CSR. Ultimately, practitioners will be able to use these correlations along with common state-of-practice geotechnical field methods to predict cyclic resistance in fine tailings and to assess the liquefaction potential and post-earthquake stability of the impoundment structures.
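For orientation, the cyclic stress ratio reported from cyclic triaxial tests is conventionally defined as half the applied deviator stress amplitude divided by the effective confining stress. This is the standard textbook definition, given here only for context, not a formula quoted from the dissertation.

```latex
\mathrm{CSR} = \frac{\sigma_{d}}{2\,\sigma'_{3c}}
% \sigma_d     : cyclic deviator stress amplitude
% \sigma'_{3c} : effective confining stress at consolidation
```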
APA, Harvard, Vancouver, ISO, and other styles
34

Al-Ajmi, Adel. "Wellbore stability analysis based on a new true-triaxial failure criterion." Doctoral thesis, Stockholm : Department of Land and Water Resources Engineering, Royal Institute of Technology, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-4037.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Lu, Zhiyong. "Text mining on GeneRIFs /." Connect to full text via ProQuest. Limited to UCD Anschutz Medical Campus, 2007.

Find full text
Abstract:
Thesis (Ph.D. in ) -- University of Colorado Denver, 2007.
Typescript. Includes bibliographical references (leaves 174-182). Free to UCD affiliates. Online version available via ProQuest Digital Dissertations;
APA, Harvard, Vancouver, ISO, and other styles
36

Gonçalves, Lea Silvia Martins. "Categorização em Text Mining." Universidade de São Paulo, 2002. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-22062015-202748/.

Full text
Abstract:
The technological and scientific advances of recent decades have led to the development of increasingly efficient methods for storing and processing data. Through analysis and interpretation of the data it is possible to obtain knowledge, and because knowledge supports decision making, it has become an element of fundamental importance for many organizations. A large part of the data available today is in textual form; an example of this is the explosive growth of the Internet. Since texts are unstructured data, a series of steps is necessary to transform them into structured data suitable for analysis. The process known as Text Mining is an emerging technology that aims at analyzing large collections of documents. This master's dissertation addresses the use of different techniques and tools for Text Mining which, together with the text pre-processing module designed and implemented by Imamura (2001), can be used for texts in Portuguese. Several algorithms used for knowledge extraction from data are explored, such as Nearest Neighbor, Naive Bayes, Decision Trees, Decision Rules, Decision Tables and Support Vector Machines. To verify the behavior of these algorithms for texts in Portuguese, a number of experiments were carried out.
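As a rough illustration of the kind of comparison described above, the sketch below trains several standard learners on a vectorized toy corpus. The toy Portuguese documents, the TF-IDF representation and the use of scikit-learn are assumptions made for illustration, not the tooling of the dissertation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Toy corpus: label 1 = sports, 0 = politics (placeholder documents).
docs = ["o time venceu o campeonato", "eleições marcadas para outubro",
        "o jogador marcou dois gols", "o congresso votou a nova lei",
        "a partida terminou empatada", "o presidente anunciou reformas"]
labels = [1, 0, 1, 0, 1, 0]

classifiers = {
    "naive_bayes": MultinomialNB(),
    "nearest_neighbor": KNeighborsClassifier(n_neighbors=3),
    "decision_tree": DecisionTreeClassifier(),
    "linear_svm": LinearSVC(dual=False),
}
for name, clf in classifiers.items():
    pipe = make_pipeline(TfidfVectorizer(), clf)   # bag-of-words + classifier
    scores = cross_val_score(pipe, docs, labels, cv=3)
    print(f"{name}: mean accuracy {scores.mean():.2f}")
```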
APA, Harvard, Vancouver, ISO, and other styles
37

Fassauer, Roland. "Personalisierung im E-Commerce – zur Wirkung von E-Mail-Personalisierung auf ausgewählte ökonomische Kennzahlen des Konsumentenverhaltens." Doctoral thesis, Universitätsbibliothek Leipzig, 2016. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-203512.

Full text
Abstract:
Personalization is an important area of Internet marketing for which few large-scale experimental studies exist, and the successful use of recommendation techniques requires extensive data on buying behavior. This thesis addresses both problems. It examines the cross-shop individual buying behavior of up to 126,000 newsletter recipients of a German online bonus system, both with selected data mining methods and experimentally. To this end, prototypes of a data mining system, an A/B testing software component and a recommender system component were developed and evaluated through data mining and online field experiments. For this user group, an experiment with even a simple recommendation method showed, first, that the cross-shop individual behavioral data of the online bonus system are suitable for generating recommendations and, second, that the recommendations generated in this way led to significantly more orders than the best recommendation based on average buying behavior. Further experiments carried out to evaluate the A/B testing component showed that absolute discount offers led to significantly more orders than relative discount offers only when they were combined with a call to action. The thesis thus contributes to research on influencing buying behavior through personalization and through different presentations of discounts, and contributes the results and artifacts described above.
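The kind of comparison made in the A/B tests (did variant A produce significantly more orders than variant B?) can be checked with a standard test of two proportions. The counts below are invented, and the chi-square test is one common choice, not necessarily the statistic used in the thesis.

```python
from scipy.stats import chi2_contingency

def ab_test(orders_a, visitors_a, orders_b, visitors_b, alpha=0.05):
    """Chi-square test on a 2x2 table of orders vs. non-orders for two variants."""
    table = [[orders_a, visitors_a - orders_a],
             [orders_b, visitors_b - orders_b]]
    chi2, p, dof, expected = chi2_contingency(table)
    return {"conversion_a": orders_a / visitors_a,
            "conversion_b": orders_b / visitors_b,
            "p_value": p,
            "significant": p < alpha}

# Hypothetical newsletter variants: absolute discount with call to action
# vs. relative discount without one.
print(ab_test(orders_a=412, visitors_a=10000, orders_b=347, visitors_b=10000))
```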
APA, Harvard, Vancouver, ISO, and other styles
38

SOARES, FABIO DE AZEVEDO. "AUTOMATIC TEXT CATEGORIZATION BASED ON TEXT MINING." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2013. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=23213@1.

Full text
Abstract:
PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO
CONSELHO NACIONAL DE DESENVOLVIMENTO CIENTÍFICO E TECNOLÓGICO
Text Categorization, one of the tasks performed in Text Mining, can be described as obtaining a function that is able to assign a document to the previously defined category to which it belongs. The main goal of building a taxonomy of documents is to make it easier to obtain relevant information. However, implementing and executing a Text Categorization process is not a trivial task: Text Mining tools are still maturing and demand considerable technical expertise to use. Moreover, the language of the documents, which plays a major role in a Text Mining process, must be handled with the peculiarities of each idiom, and there is a great shortage of tools that provide proper handling of Brazilian Portuguese. Thus, the main aims of this work are to research, propose, implement and evaluate a Text Mining framework for Automatic Text Categorization that is capable of assisting the execution of the knowledge discovery process and that provides language processing for Brazilian Portuguese.
APA, Harvard, Vancouver, ISO, and other styles
39

Baker, Simon. "Semantic text classification for cancer text mining." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/275838.

Full text
Abstract:
Cancer researchers and oncologists benefit greatly from text mining major knowledge sources in biomedicine such as PubMed. Fundamentally, text mining depends on accurate text classification. In conventional natural language processing (NLP), this requires experts to annotate scientific text, which is costly and time consuming, resulting in small labelled datasets. This leads to extensive feature engineering and handcrafting in order to fully utilise small labelled datasets, which is again time consuming and not portable between tasks and domains. In this work, we explore emerging neural network methods to reduce the burden of feature engineering while outperforming the accuracy of conventional pipeline NLP techniques. We focus specifically on the cancer domain in terms of applications, where we introduce two NLP classification tasks and datasets: the first task is that of semantic text classification according to the Hallmarks of Cancer (HoC), which enables text mining of scientific literature assisted by a taxonomy that explains the processes by which cancer starts and spreads in the body. The second task is that of the exposure routes of chemicals into the body that may lead to exposure to carcinogens. We present several novel contributions. We introduce two new semantic classification tasks (the hallmarks and exposure routes) at both sentence and document levels along with accompanying datasets, and implement and investigate a conventional pipeline NLP classification approach for both tasks, performing both intrinsic and extrinsic evaluation. We propose a new approach to classification using multilevel embeddings and apply this approach to several tasks; we subsequently apply deep learning methods to the task of hallmark classification and evaluate the outcome. Utilising our text classification methods, we develop two novel text mining tools targeting real-world cancer researchers. The first tool is a cancer hallmark text mining tool that identifies associations between a search query and cancer hallmarks; the second tool is a new literature-based discovery (LBD) system designed for the cancer domain. We evaluate both tools with end users (cancer researchers) and find they demonstrate good accuracy and promising potential for cancer research.
APA, Harvard, Vancouver, ISO, and other styles
40

Zaghloul, Waleed A. Lee Sang M. "Text mining using neural networks." Lincoln, Neb. : University of Nebraska-Lincoln, 2005. http://0-www.unl.edu.library.unl.edu/libr/Dissertations/2005/Zaghloul.pdf.

Full text
Abstract:
Thesis (Ph.D.)--University of Nebraska-Lincoln, 2005.
Title from title screen (sites viewed on Oct. 18, 2005). PDF text: 100 p. : col. ill. Includes bibliographical references (p. 95-100 of dissertation).
APA, Harvard, Vancouver, ISO, and other styles
41

Al-Halimi, Reem Khalil. "Mining Topic Signals from Text." Thesis, University of Waterloo, 2003. http://hdl.handle.net/10012/1165.

Full text
Abstract:
This work aims at studying the effect of word position in text on understanding and tracking the content of written text. In this thesis we present two uses of word position in text: topic word selectors and topic flow signals. The topic word selectors identify important words, called topic words, by their spread through a text. The underlying assumption here is that words that repeat across the text are likely to be more relevant to the main topic of the text than ones that are concentrated in small segments. Our experiments show that manually selected keywords correspond more closely to topic words extracted using these selectors than to words chosen using more traditional indexing techniques. This correspondence indicates that topic words identify the topical content of the documents more than words selected using the traditional indexing measures that do not utilize word position in text. The second approach to applying word position is through topic flow signals. In this representation, words are replaced by the topics to which they refer. The flow of any one topic can then be traced throughout the document and viewed as a signal that rises when a word relevant to the topic is used and falls when an irrelevant word occurs. To reflect the flow of the topic in larger segments of text we use a simple smoothing technique. The resulting smoothed signals are shown to be correlated to the ideal topic flow signals for the same document. Finally, we characterize documents using the importance of their topic words and the spread of these words in the document. When incorporated into a Support Vector Machine classifier, this representation is shown to drastically reduce the vocabulary size and improve the classifier's performance compared to the traditional word-based, vector space representation.
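A minimal sketch of the topic flow signal idea, replacing each token with a 0/1 indicator of topic relevance and smoothing the sequence so the topic's rise and fall becomes visible, is given below. The moving-average window and the tiny keyword set are placeholder assumptions; the thesis's own smoothing technique may differ.

```python
import numpy as np

def topic_flow(tokens, topic_words, window=5):
    """Binary indicator of topic-word occurrences, smoothed with a moving average."""
    raw = np.array([1.0 if t.lower() in topic_words else 0.0 for t in tokens])
    kernel = np.ones(window) / window
    return np.convolve(raw, kernel, mode="same")  # rises where the topic is discussed

# Hypothetical topic lexicon and document.
tokens = "the mine tailings dam failed after heavy rain flooded the valley".split()
signal = topic_flow(tokens, topic_words={"mine", "tailings", "dam"})
print(np.round(signal, 2))
```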
APA, Harvard, Vancouver, ISO, and other styles
42

Rice, Simon B. "Text data mining in bioinformatics." Thesis, University of Manchester, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.488351.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Meyer, David, Kurt Hornik, and Ingo Feinerer. "Text Mining Infrastructure in R." American Statistical Association, 2008. http://epub.wu.ac.at/3978/1/textmining.pdf.

Full text
Abstract:
During the last decade text mining has become a widely used discipline utilizing statistical and machine learning methods. We present the tm package which provides a framework for text mining applications within R. We give a survey on text mining facilities in R and explain how typical application tasks can be carried out using our framework. We present techniques for count-based analysis methods, text clustering, text classification and string kernels. (authors' abstract)
APA, Harvard, Vancouver, ISO, and other styles
44

Theußl, Stefan, Ingo Feinerer, and Kurt Hornik. "Distributed Text Mining in R." WU Vienna University of Economics and Business, 2011. http://epub.wu.ac.at/3034/1/Theussl_etal%2D2011%2Dpreprint.pdf.

Full text
Abstract:
R has recently gained explicit text mining support with the "tm" package enabling statisticians to answer many interesting research questions via statistical analysis or modeling of (text) corpora. However, we typically face two challenges when analyzing large corpora: (1) the amount of data to be processed in a single machine is usually limited by the available main memory (i.e., RAM), and (2) an increase of the amount of data to be analyzed leads to increasing computational workload. Fortunately, adequate parallel programming models like MapReduce and the corresponding open source implementation called Hadoop allow for processing data sets beyond what would fit into memory. In this paper we present the package "tm.plugin.dc" offering a seamless integration between "tm" and Hadoop. We show on the basis of an application in culturomics that we can efficiently handle data sets of significant size.
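The MapReduce pattern the report builds on, mapping each document to term counts and then reducing by merging the partial counts, can be illustrated without Hadoop using Python's multiprocessing. The tm.plugin.dc package itself runs on an R/Hadoop stack, so this is only a schematic of the programming model, not its API.

```python
from collections import Counter
from functools import reduce
from multiprocessing import Pool

def map_phase(document):
    """Map: one document -> term counts for that document."""
    return Counter(document.lower().split())

def reduce_phase(counts_a, counts_b):
    """Reduce: merge two partial term-count tables."""
    counts_a.update(counts_b)
    return counts_a

if __name__ == "__main__":
    corpus = ["distributed text mining with hadoop",
              "mapreduce splits work across machines",
              "text mining of large corpora needs distributed processing"]
    with Pool(processes=2) as pool:
        partial_counts = pool.map(map_phase, corpus)   # parallel map over documents
    totals = reduce(reduce_phase, partial_counts, Counter())
    print(totals.most_common(5))
```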
Series: Research Report Series / Department of Statistics and Mathematics
APA, Harvard, Vancouver, ISO, and other styles
45

Martins, Bruno. "Geographically Aware Web Text Mining." Master's thesis, Department of Informatics, University of Lisbon, 2009. http://hdl.handle.net/10451/14301.

Full text
Abstract:
Text mining and search have become important research areas over the past few years, mostly due to the large popularity of the Web. A natural extension for these technologies is the development of methods for exploring the geographic context of Web information. Human information needs often present specific geographic constraints. Many Web documents also refer to specific locations. However, relatively little effort has been spent on developing the facilities required for geographic access to unstructured textual information. Geographically aware text mining and search remain relatively unexplored. This thesis addresses this new area, arguing that Web text mining can be applied to extract geographic context information, and that this information can be explored for information retrieval. Fundamental questions investigated include handling geographic references in text, assigning geographic scopes to the documents, and building retrieval applications that handle/use geographic scopes. The thesis presents appropriate solutions for each of these challenges, together with a comprehensive evaluation of their effectiveness. By investigating these questions, the thesis presents several findings on how the geographic context can be effectively handled by text processing tools.
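A first step in such a pipeline, spotting geographic references in text before assigning a document a geographic scope, can be approximated with a small gazetteer lookup. The gazetteer entries and the naive scope rule below are placeholder assumptions; the thesis's own methods are considerably more elaborate.

```python
# Hypothetical mini-gazetteer: place name -> (latitude, longitude, population).
GAZETTEER = {
    "lisbon": (38.72, -9.14, 545_000),
    "porto":  (41.15, -8.61, 232_000),
    "braga":  (41.55, -8.43, 193_000),
}

def geo_references(text):
    """Return gazetteer matches found in the text, token by token."""
    hits = []
    for token in text.lower().replace(",", " ").replace(".", " ").split():
        if token in GAZETTEER:
            hits.append((token, GAZETTEER[token]))
    return hits

def document_scope(text):
    """Naive geographic scope: the most mentioned place, ties broken by population."""
    hits = geo_references(text)
    if not hits:
        return None
    counts = {}
    for name, _meta in hits:
        counts[name] = counts.get(name, 0) + 1
    return max(counts, key=lambda n: (counts[n], GAZETTEER[n][2]))

print(document_scope("The conference moves from Lisbon to Porto, but Lisbon remains the hub."))
```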
APA, Harvard, Vancouver, ISO, and other styles
46

Munyana, Nicole. "Le text mining et XML." Thèse, Trois-Rivières : Université du Québec à Trois-Rivières, 2007. http://www.uqtr.ca/biblio/notice/resume/30024815R.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Schieber, Andreas, and Paul Kruse. "Idea Mining." Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2014. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-140499.

Full text
Abstract:
Motivated by the success of Web 2.0 and social media in many areas of public life, and by the associated open innovation movement that actively involves customers in the innovation process, this paper proposes an integration of knowledge management and text mining to improve this innovation process. With the described approach, customers are not only motivated to share their ideas and needs on web-based communication platforms; the resulting text data can also be evaluated automatically and used for targeted and timely further development of products. The resulting process model and its potential are illustrated using two application scenarios from practice.
APA, Harvard, Vancouver, ISO, and other styles
48

McDonald, Daniel Merrill. "Combining Text Structure and Meaning to Support Text Mining." Diss., The University of Arizona, 2006. http://hdl.handle.net/10150/194015.

Full text
Abstract:
Text mining methods strive to make unstructured text more useful for decision making. As part of the mining process, language is processed prior to analysis. Processing techniques have often focused primarily on either text structure or text meaning in preparing documents for analysis. As approaches have evolved over the years, increases in the use of lexical semantic parsing usually have come at the expense of full syntactic parsing. This work explores the benefits of combining structure and meaning or syntax and lexical semantics to support the text mining process.Chapter two presents the Arizona Summarizer, which includes several processing approaches to automatic text summarization. Each approach has varying usage of structural and lexical semantic information. The usefulness of the different summaries is evaluated in the finding stage of the text mining process. The summary produced using structural and lexical semantic information outperforms all others in the browse task. Chapter three presents the Arizona Relation Parser, a system for extracting relations from medical texts. The system is a grammar-based system that combines syntax and lexical semantic information in one grammar for relation extraction. The relation parser attempts to capitalize on the high precision performance of semantic systems and the good coverage of the syntax-based systems. The parser performs in line with the top reported systems in the literature. Chapter four presents the Arizona Entity Finder, a system for extracting named entities from text. The system greatly expands on the combination grammar approach from the relation parser. Each tag is given a semantic and syntactic component and placed in a tag hierarchy. Over 10,000 tags exist in the hierarchy. The system is tested on multiple domains and is required to extract seven additional types of entities in the second corpus. The entity finder achieves a 90 percent F-measure on the MUC-7 data and an 87 percent F-measure on the Yahoo data where additional entity types were extracted.Together, these three chapters demonstrate that combining text structure and meaning in algorithms to process language has the potential to improve the text mining process. A lexical semantic grammar is effective at recognizing domain-specific entities and language constructs. Syntax information, on the other hand, allows a grammar to generalize its rules when possible. Balancing performance and coverage in light of the world's growing body of unstructured text is important.
APA, Harvard, Vancouver, ISO, and other styles
49

Olsson, Elin. "Deriving Genetic Networks Using Text Mining." Thesis, University of Skövde, Department of Computer Science, 2002. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-708.

Full text
Abstract:

On the Internet an enormous amount of information is available in unstructured form. The purpose of a text mining tool is to collect this information and present it in a more structured form. In this report, text mining is used to create an algorithm that searches abstracts available from PubMed and finds specific relationships between genes that can be used to create a network. The algorithm can also be used to find information about a specific gene. Using the algorithm, the network created by Mendoza et al. (1999) was verified in all of its connections but one; this connection contained implicit information. The results suggest that the algorithm is better at extracting information about specific genes than at finding connections between genes. One advantage of the algorithm is that it can also find connections between genes and proteins, and between genes and other chemical substances.
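The basic mechanism such a tool relies on, scanning abstract sentences for pairs of gene names joined by an interaction verb, can be sketched with a simple pattern matcher. The gene list, verb list and example sentence are invented; the thesis's algorithm and the PubMed retrieval step are not reproduced here.

```python
import re

GENES = {"lin-3", "let-23", "lin-12", "lag-2"}            # hypothetical gene symbols
RELATION_VERBS = {"activates", "inhibits", "represses", "induces"}

def extract_relations(sentence):
    """Return (gene, verb, gene) triples when two known genes flank a relation verb."""
    tokens = re.findall(r"[\w\-]+", sentence.lower())
    triples = []
    for i, tok in enumerate(tokens):
        if tok in RELATION_VERBS:
            left = [t for t in tokens[:i] if t in GENES]
            right = [t for t in tokens[i + 1:] if t in GENES]
            if left and right:
                triples.append((left[-1], tok, right[0]))
    return triples

# Hypothetical abstract sentence.
print(extract_relations("Signalling by lin-3 activates let-23 in the vulval precursor cells."))
```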

APA, Harvard, Vancouver, ISO, and other styles
50

Fivelstad, Ole Kristian. "Temporal Text Mining : The TTM Testbench." Thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2007. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-8764.

Full text
Abstract:

This master thesis presents the Temporal Text Mining (TTM) Testbench, an application for discovering association rules in temporal document collections. It continues work done in projects in the fall of 2005 and the fall of 2006, which laid the foundation for this thesis. The focus of the work is on identifying and extracting meaningful terms from textual documents to improve the meaningfulness of the mined association rules. Much work has been done to compile the theoretical foundation of this project, and this foundation has been used for assessing different approaches to finding meaningful and descriptive terms. The old TTM Testbench has been extended to include usage of WordNet, together with operations for finding collocations, performing word sense disambiguation, and extracting higher-level concepts and categories from the individual documents. A method for rating association rules based on the semantic similarity of the terms present in the rules has also been implemented, in an attempt to narrow down the result set and filter out rules which are not likely to be interesting. Experiments performed with the improved application show that the usage of WordNet and the new operations can help increase the meaningfulness of the rules. One factor which plays a big part in this is that synonyms of words are added to make the terms more understandable. However, the experiments showed that it was difficult to decide whether a rule was interesting or not, which made it impossible to draw any conclusions regarding the suitability of semantic similarity for finding interesting rules. All work on the TTM Testbench so far has focused on finding association rules in web newspapers. It may, however, be useful to perform experiments in a more limited domain, for example medicine, where the interestingness of a rule may be more easily decided.
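The rule mining step at the core of the testbench, finding terms that co-occur in documents from a period often enough and forming rules whose confidence exceeds a threshold, can be illustrated with a minimal support/confidence computation. The thresholds, the single-antecedent rule shape and the toy documents are illustrative assumptions, not the testbench's actual algorithm.

```python
from itertools import combinations

def mine_rules(documents, min_support=0.4, min_confidence=0.7):
    """Minimal association rules A -> B over sets of terms per document."""
    n = len(documents)
    docs = [set(d) for d in documents]
    terms = set().union(*docs)
    support = {frozenset([t]): sum(t in d for d in docs) / n for t in terms}
    for a, b in combinations(terms, 2):
        pair = frozenset([a, b])
        support[pair] = sum(pair <= d for d in docs) / n
    rules = []
    for a, b in combinations(terms, 2):
        pair_sup = support[frozenset([a, b])]
        if pair_sup < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            conf = pair_sup / support[frozenset([lhs])]
            if conf >= min_confidence:
                rules.append((lhs, rhs, pair_sup, conf))
    return rules

# Toy "documents" as bags of extracted terms for one time period.
docs = [["election", "minister", "vote"], ["election", "vote"],
        ["election", "minister"], ["football", "match"]]
for lhs, rhs, sup, conf in mine_rules(docs):
    print(f"{lhs} -> {rhs}  support={sup:.2f} confidence={conf:.2f}")
```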

APA, Harvard, Vancouver, ISO, and other styles
