Academic literature on the topic 'Approximate Mining'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Approximate Mining.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press on it, and we will generate automatically the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as pdf and read online its abstract whenever available in the metadata.

Dissertations / Theses on the topic "Approximate Mining"

1

Kolli, Lakshmi Priya. "Mining for Frequent Community Structures using Approximate Graph Matching." University of Cincinnati / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1623166375110273.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Agarwal, Khushbu. "A partition based approach to approximate tree mining a memory hierarchy perspective /." Columbus, Ohio : Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1196284256.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Agarwal, Khushbu. "A partition based approach to approximate tree mining : a memory hierarchy perspective." The Ohio State University, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=osu1196284256.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Carraher, Lee A. "Approximate Clustering Algorithms for High Dimensional Streaming and Distributed Data." University of Cincinnati / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1511860805777818.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Fuhry, David P. "PLASMA-HD: Probing the LAttice Structure and MAkeup of High-dimensional Data." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1440431146.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Sartipi, Kamran. "Software Architecture Recovery based on Pattern Matching." Thesis, University of Waterloo, 2003. http://hdl.handle.net/10012/1122.

Full text
Abstract:
Pattern matching approaches in reverse engineering aim to incorporate domain knowledge and system documentation in the software architecture extraction process, hence provide a user/tool collaborative environment for architectural design recovery. This thesis presents a model and an environment for recovering the high level design of legacy software systems based on user defined architectural patterns and graph matching techniques. In the proposed model, a high-level view of a software system in terms of the system components and their interactions is represented as a query, using a description language. A query is mapped onto a pattern-graph, where a module and its interactions with other modules are represented as a group of graph-nodes and a group of graph-edges, respectively. Interaction constraints can be modeled by the description language as a part of the query. Such a pattern-graph is applied against an entity-relation graph that represents the information extracted from the source code of the software system. An approximate graph matching process performs a series of graph edit operations (i. e. , node/edge insertion/deletion) on the pattern-graph and uses a ranking mechanism based on data mining association to obtain a sub-optimal solution. The obtained solution corresponds to an extracted architecture that complies with the given query. An interactive prototype toolkit implemented as part of this thesis provides an environment for architecture recovery in two levels. First the system is decomposed into a number of subsystems of files. Second each subsystem can be decomposed into a number of modules of functions, datatypes, and variables.
APA, Harvard, Vancouver, ISO, and other styles
7

Kuo, Fang-Chen, and 郭芳甄. "Mining Approximate Frequent Itemsets from Noisy Data." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/6waagh.

Full text
Abstract:
碩士<br>東吳大學<br>資訊科學系<br>96<br>To discover association rules, frequent itemset mining can find out items that appear frequently together in a dataset and to be the first step in the analysis of data arising in broad range of application. Moreover, industry can use mining result to improve marketing strategy and profitability. Traditional frequent itemset mining utilizes the “exact” mode. However, the exact-mode mining is not appropriate for real data. Mining noisy data using the exact mode cannot generate correct frequent itemsets, and may eventually lead to incorrect decisions. In recent years, many researchers have studied how to discover frequent itemsets from noisy data. However, existing methods can become inefficient when the dataset is sparse. Therefore, these methods cannot be applied to all kinds of datasets. In this paper, we propose a new algorithm, called the TAFI algorithm, for mining approximate frequent itemsets. The TAFI algorithm not only can correctly and efficiently discover approximate frequent itemsets from noisy data, but also can perform well with spare datasets.
APA, Harvard, Vancouver, ISO, and other styles
8

Mantovani, Matteo. "Approximate Data Mining Techniques on Clinical Data." Doctoral thesis, 2020. http://hdl.handle.net/11562/1018039.

Full text
Abstract:
The past two decades have witnessed an explosion in the number of medical and healthcare datasets available to researchers and healthcare professionals. Data collection efforts are highly required, and this prompts the development of appropriate data mining techniques and tools that can automatically extract relevant information from data. Consequently, they provide insights into various clinical behaviors or processes captured by the data. Since these tools should support decision-making activities of medical experts, all the extracted information must be represented in a human-friendly way, that is, in a concise and easy-to-understand form. To this purpose, here we propose a new framework that collects different new mining techniques and tools proposed. These techniques mainly focus on two aspects: the temporal one and the predictive one. All of these techniques were then applied to clinical data and, in particular, ICU data from MIMIC III database. It showed the flexibility of the framework, which is able to retrieve different outcomes from the overall dataset. The first two techniques rely on the concept of Approximate Temporal Functional Dependencies (ATFDs). ATFDs have been proposed, with their suitable treatment of temporal information, as a methodological tool for mining clinical data. An example of the knowledge derivable through dependencies may be "within 15 days, patients with the same diagnosis and the same therapy usually receive the same daily amount of drug". However, current ATFD models are not analyzing the temporal evolution of the data, such as "For most patients with the same diagnosis, the same drug is prescribed after the same symptom". To this extent, we propose a new kind of ATFD called Approximate Pure Temporally Evolving Functional Dependencies (APEFDs). Another limitation of such kind of dependencies is that they cannot deal with quantitative data when some tolerance can be allowed for numerical values. In particular, this limitation arises in clinical data warehouses, where analysis and mining have to consider one or more measures related to quantitative data (such as lab test results and vital signs), concerning multiple dimensional (alphanumeric) attributes (such as patient, hospital, physician, diagnosis) and some time dimensions (such as the day since hospitalization and the calendar date). According to this scenario, we introduce a new kind of ATFD, named Multi-Approximate Temporal Functional Dependency (MATFD), which considers dependencies between dimensions and quantitative measures from temporal clinical data. These new dependencies may provide new knowledge as "within 15 days, patients with the same diagnosis and the same therapy receive a daily amount of drug within a fixed range". The other techniques are based on pattern mining, which has also been proposed as a methodological tool for mining clinical data. However, many methods proposed so far focus on mining of temporal rules which describe relationships between data sequences or instantaneous events, without considering the presence of more complex temporal patterns into the dataset. These patterns, such as trends of a particular vital sign, are often very relevant for clinicians. Moreover, it is really interesting to discover if some sort of event, such as a drug administration, is capable of changing these trends and how. To this extent, we propose a new kind of temporal patterns, called Trend-Event Patterns (TEPs), that focuses on events and their influence on trends that can be retrieved from some measures, such as vital signs. With TEPs we can express concepts such as "The administration of paracetamol on a patient with an increasing temperature leads to a decreasing trend in temperature after such administration occurs". We also decided to analyze another interesting pattern mining technique that includes prediction. This technique discovers a compact set of patterns that aim to describe the condition (or class) of interest. Our framework relies on a classification model that considers and combines various predictive pattern candidates and selects only those that are important to improve the overall class prediction performance. We show that our classification approach achieves a significant reduction in the number of extracted patterns, compared to the state-of-the-art methods based on minimum predictive pattern mining approach, while preserving the overall classification accuracy of the model. For each technique described above, we developed a tool to retrieve its kind of rule. All the results are obtained by pre-processing and mining clinical data and, as mentioned before, in particular ICU data from MIMIC III database.
APA, Harvard, Vancouver, ISO, and other styles
9

Liu, Chunyang. "Summarizing data with representative patterns." Thesis, 2016. http://hdl.handle.net/10453/52923.

Full text
Abstract:
University of Technology Sydney. Faculty of Engineering and Information Technology.<br>The advance of technology makes data acquisition and storage become unprecedentedly convenient. It contributes to the rapid growth of not only the volume but also the veracity and variety of data in recent years, which poses new challenges to the data mining area. For example, uncertain data mining emerges due to its capability to model the inherent veracity of data; spatial data mining attracts much research attention as the widespread of location-based services and wearable devices. As a fundamental topic of data mining, how to effectively and efficiently summarize data in this situation still remains to be explored. This thesis studied the problem of summarizing data with representative patterns. The objective is to find a set of patterns, which is much more concise but still contains rich information of the original data, and may provide valuable insights for further analysis of data. In the light of this idea, we formally formulate the problem and provide effective and efficient solutions in various scenarios. We study the problem of summarizing probabilistic frequent patterns over uncertain data. Probabilistic frequent pattern mining over uncertain data has received much research attention due to the wide applicabilities of uncertain data. It suffers from the problem of generating an exponential number of result patterns, which hinders the analysis of patterns and calls for the need to find a small number of representative patterns to approximate all other patterns. We formally formulate the problem of probabilistic representative frequent pattern (P-RFP) mining, which aims to find the minimal set of patterns with sufficiently high probability to represent all other patterns. The bottleneck turns out to be checking whether a pattern can probabilistically represent another, which involves the computation of a joint probability of the supports of two patterns. We propose a novel dynamic programming-based approach to address the problem and devise effective optimization strategies to improve the computation efficiency. To enhance the practicability of P-RFP mining, we introduce a novel approximation of the joint probability with both theoretical and empirical proofs. Based on the approximation, we propose an Approximate P-RFP Mining (APM) algorithm, which effectively and efficiently compresses the probabilistic frequent pattern set. The error rate of APM is guaranteed to be very small when the database contains hundreds of transactions, which further affirms that APM is a practical solution for summarizing probabilistic frequent patterns. We address the problem of directly summarizing uncertain transaction database by formulating the problem as Minimal Probabilistic Tile Cover Mining, which aims to find a high-quality probabilistic tile set covering an uncertain database with minimal cost. We define the concept of Probabilistic Price and Probabilistic Price Order to evaluate and compare the quality of tiles, and propose a framework to discover the minimal probabilistic tile cover. The bottleneck is to check whether a tile is better than another according to the Probabilistic Price Order, which involves the computation of a joint probability. We prove that it can be decomposed into independent terms and calculated efficiently. Several optimization techniques are devised to further improve the performance. We analyze the problem of summarizing co-locations mined from spatial databases. Co-location pattern mining finds patterns of spatial features whose instances tend to locate together in geographic space. However, the traditional framework of co-location pattern mining produces an exponential number of patterns because of the downward closure property, which makes it difficult for users to understand, assess or apply the huge number of resulted patterns. To address this issue, we study the problem of mining representative co-location patterns (RCP). We first define a covering relationship between two co-location patterns then formally formulate the problem of Representative Co-location Pattern mining. To solve the problem of RCP mining, we propose the RCPFast algorithm adopting the post-mining framework and the RCPMS algorithm pushing pattern summarization into the co-location mining process.
APA, Harvard, Vancouver, ISO, and other styles
10

董原賓. "Approximately mining frequent representative itemsets on data streams." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/31159769915805928034.

Full text
APA, Harvard, Vancouver, ISO, and other styles
More sources
We offer discounts on all premium plans for authors whose works are included in thematic literature selections. Contact us to get a unique promo code!

To the bibliography