Theses on the topic "Spatial data mining"
Consult the top 50 theses for your research on the topic "Spatial data mining".
Zhang, Xin Iris and 張欣. "Fast mining of spatial co-location patterns". Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2004. http://hub.hku.hk/bib/B30462708.
Yang, Zhao. "Spatial Data Mining Analytical Environment for Large Scale Geospatial Data". ScholarWorks@UNO, 2016. http://scholarworks.uno.edu/td/2284.
Al-Naymat, Ghazi. "NEW METHODS FOR MINING SEQUENTIAL AND TIME SERIES DATA". Thesis, The University of Sydney, 2009. http://hdl.handle.net/2123/5295.
Al-Naymat, Ghazi. "NEW METHODS FOR MINING SEQUENTIAL AND TIME SERIES DATA". University of Sydney, 2009. http://hdl.handle.net/2123/5295.
Data mining is the process of extracting knowledge from large amounts of data. It covers a variety of techniques aimed at discovering diverse types of patterns on the basis of the requirements of the domain. These techniques include association rule mining, classification, cluster analysis and outlier detection. The availability of applications that produce massive amounts of spatial, spatio-temporal (ST) and time series data (TSD) is the rationale for developing specialized techniques to mine such data. In spatial data mining, the spatial co-location rule problem is different from the association rule problem, since there is no natural notion of transactions in spatial datasets that are embedded in continuous geographic space. Therefore, we have proposed an efficient algorithm (GridClique) to mine interesting spatial co-location patterns (maximal cliques). These patterns are used as the raw transactions for an association rule mining technique to discover complex co-location rules. Our proposal includes certain types of complex relationships, especially negative relationships, in the patterns. The relationships can be obtained from only the maximal clique patterns, which have never been used until now. Our approach is applied to a well-known astronomy dataset obtained from the Sloan Digital Sky Survey (SDSS). ST data is continuously collected and made accessible in the public domain. We present an approach to mine and query large ST data with the aim of finding interesting patterns and understanding the underlying process of data generation. An important class of queries is based on the flock pattern. A flock is a large subset of objects moving along paths close to each other for a predefined time.
One approach to processing a "flock query" is to map ST data into high-dimensional space and to reduce the query to a sequence of standard range queries that can be answered using a spatial indexing structure; however, the performance of spatial indexing structures rapidly deteriorates in high-dimensional space. This thesis sets out a preprocessing strategy that uses a random projection to reduce the dimensionality of the transformed space. We use probabilistic arguments to prove the accuracy of the projection and present experimental results that show the possibility of managing the curse of dimensionality in an ST setting by combining random projections with traditional data structures. In time series data mining, we devised a new space-efficient algorithm (SparseDTW) to compute the dynamic time warping (DTW) distance between two time series, which always yields the optimal result. This is in contrast to other approaches, which typically sacrifice optimality to attain space efficiency. The main idea behind our approach is to dynamically exploit the existence of similarity and/or correlation between the time series: the greater the similarity between the time series, the less space is required to compute the DTW between them. Other techniques for speeding up DTW impose a priori constraints and do not exploit similarity characteristics that may be present in the data. Our experiments demonstrate that SparseDTW outperforms these approaches. By applying the SparseDTW algorithm, we discover an interesting pattern: "pairs trading" in a large stock-market dataset of daily index prices from the Australian Stock Exchange (ASX) from 1980 to 2002.
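As a point of reference for the DTW distance mentioned in the abstract above, the following is a minimal sketch of the textbook dynamic-programming formulation, which uses a full cost table. It is not SparseDTW, whose details the abstract does not give; the function name and the absolute-difference local cost are assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences (full O(n*m) table).

    Illustrative only: this is the classic algorithm, not SparseDTW.
    The local cost |x - y| is an assumption.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

The table holds (n + 1) x (m + 1) cells, which is the space cost that space-efficient variants aim to reduce.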
Koperski, Krzysztof. "A progressive refinement approach to spatial data mining". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape7/PQDD_0024/NQ51882.pdf.
Yang, Hui. "A general framework for mining spatial and spatio-temporal object association patterns in scientific data". Columbus, Ohio : Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1155319799.
Yu, Ping. "FP-tree Based Spatial Co-location Pattern Mining". Thesis, University of North Texas, 2005. https://digital.library.unt.edu/ark:/67531/metadc4724/.
SHENCOTTAH, K. N. KALYANKUMAR. "FINDING CLUSTERS IN SPATIAL DATA". University of Cincinnati / OhioLINK, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1179521337.
Lin, Zhungshan. "Optimal Candidate Generation in Spatial Co-Location Mining". DigitalCommons@USU, 2009. https://digitalcommons.usu.edu/etd/377.
Pech, Palacio Manuel Alfredo. "Spatial data modeling and mining using a graph-based representation". Lyon, INSA, 2005. http://theses.insa-lyon.fr/publication/2005ISAL0118/these.pdf.
We propose a unique graph-based model to represent spatial data, non-spatial data and the spatial relations among spatial objects. We will generate datasets composed of graphs with a set of these three elements. We consider that by mining a dataset with these characteristics, a graph-based mining tool can search for patterns involving all these elements at the same time, improving the results of the spatial analysis task. A significant characteristic of spatial data is that the attributes of the neighbors of an object may have an influence on the object itself. So, we propose to include in the model three relationship types (topological, orientation, and distance relations). In the model, the spatial data (i.e., spatial objects), non-spatial data (i.e., non-spatial attributes), and spatial relations are represented as a collection of one or more directed graphs. A directed graph contains a collection of vertices and edges representing all these elements. Vertices represent either spatial objects, spatial relations between two spatial objects (binary relations), or non-spatial attributes describing the spatial objects. Edges represent a link between two vertices of any type. According to the type of vertices that an edge joins, it can represent either an attribute name or a spatial relation name. The attribute name can refer to a spatial object or a non-spatial entity. We use directed edges to represent directional information of relations among elements (i.e., object x touches object y) and to describe attributes of objects (i.e., object x has attribute z). We propose to adopt the Subdue system, a general graph-based data mining system developed at the University of Texas at Arlington, as our mining tool. A special feature named overlap has a primary role in the substructure discovery process and consequently a direct impact on the generated results. However, it is currently implemented in an orthodox way: all or nothing.
Therefore, we propose a third approach: limited overlap, which gives the user the capability to set the vertices over which overlap will be allowed. We see three direct motivations for implementing the new algorithm: search space reduction, processing time reduction, and specialized overlapping-pattern-oriented search.
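One possible reading of the graph model described in the abstract above can be sketched as a small labelled directed graph. This is an illustration under assumptions, not the thesis's actual data format or the Subdue input format: the class, method names, and vertex-id scheme are hypothetical, and labelling relation edges with the relation type (for example "topological") is a guess at how the three relationship types might be encoded.

```python
from collections import defaultdict


class SpatialGraph:
    """Directed, labelled graph; all names here are hypothetical."""

    def __init__(self):
        self.vertex_label = {}              # vertex id -> label
        self.out_edges = defaultdict(list)  # vertex id -> [(edge label, target id)]

    def add_vertex(self, vid, label):
        self.vertex_label[vid] = label

    def add_edge(self, src, label, dst):
        # Both endpoints must already exist so that edges never dangle.
        if src not in self.vertex_label or dst not in self.vertex_label:
            raise KeyError("unknown vertex")
        self.out_edges[src].append((label, dst))

    def add_attribute(self, obj, name, value):
        # A non-spatial attribute is its own vertex; the edge carries its name.
        attr_id = f"{obj}.{name}"
        self.add_vertex(attr_id, value)
        self.add_edge(obj, name, attr_id)

    def add_relation(self, src_obj, kind, name, dst_obj):
        # A binary spatial relation is a vertex placed between the two objects;
        # edge direction encodes "src_obj <name> dst_obj".
        rel_id = f"{src_obj}-{name}-{dst_obj}"
        self.add_vertex(rel_id, name)
        self.add_edge(src_obj, kind, rel_id)
        self.add_edge(rel_id, kind, dst_obj)
        return rel_id


# Usage: object x (a river) touches object y (a road) and has one attribute.
g = SpatialGraph()
g.add_vertex("x", "river")
g.add_vertex("y", "road")
g.add_attribute("x", "length_km", 12.5)
g.add_relation("x", "topological", "touches", "y")
```

Keeping relations as vertices rather than edge labels lets a pattern miner treat "touches" as a matchable element in its own right, which seems to be the intent of the description.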
Pech, Palacio Manuel Alfredo; Laurini, Robert; Tchounikine, Anne; Sol Martínez, David. "Spatial data modeling and mining using a graph-based representation". Villeurbanne : Doc'INSA, 2006. http://docinsa.insa-lyon.fr/these/pont.php?id=pech_palacio.
Thesis defended under joint supervision (co-tutelle). Thesis written in French, English, and Spanish. Title from the title screen. Bibliography: p. 174-182.
Kou, Yufeng. "Abnormal Pattern Recognition in Spatial Data". Diss., Virginia Tech, 2006. http://hdl.handle.net/10919/30145.
Ph. D.
Sips, Mike. "Pixel-based visual data mining in large geo-spatial point sets /". Konstanz : Hartung-Gorre, 2006. http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&doc_number=014881714&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA.
Ruß, Georg [Verfasser] and Rudolf [Akademischer Betreuer] Kruse. "Spatial data mining in precision agriculture / Georg Ruß. Betreuer: Rudolf Kruse". Magdeburg : Universitätsbibliothek, 2012. http://d-nb.info/1047596296/34.
Ruß, Georg [Verfasser] and Rudolf [Akademischer Betreuer] Kruse. "Spatial data mining in precision agriculture / Georg Ruß. Betreuer: Rudolf Kruse". Magdeburg : Universitätsbibliothek, 2012. http://nbn-resolving.de/urn:nbn:de:gbv:ma9:1-820.
Lan, Liang. "Data Mining Algorithms for Classification of Complex Biomedical Data". Diss., Temple University Libraries, 2012. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/214773.
Ph.D.
In my dissertation, I will present my research, which contributes to solving the following three open problems from biomedical informatics: (1) multi-task approaches for microarray classification; (2) multi-label classification of gene and protein prediction from multi-source biological data; (3) spatial scan for movement data. In microarray classification, samples belong to several predefined categories (e.g., cancer vs. control tissues) and the goal is to build a predictor that classifies a new tissue sample based on its microarray measurements. When faced with small-sample, high-dimensional microarray data, most machine learning algorithms would produce an overly complicated model that performs well on training data but poorly on new data. To reduce the risk of over-fitting, feature selection becomes an essential technique in microarray classification. However, standard feature selection algorithms are bound to underperform when the size of the microarray data is particularly small. The best remedy is to borrow strength from external microarray datasets. In this dissertation, I will present two new multi-task feature filter methods which can improve the classification performance by utilizing the external microarray data. The first method is to aggregate the feature selection results from multiple microarray classification tasks. The resulting multi-task feature selection can be shown to improve the quality of the selected features and lead to higher classification accuracy. The second method jointly selects a small gene set with maximal discriminative power and minimal redundancy across multiple classification tasks by solving an objective function with integer constraints. In the protein function prediction problem, gene functions are predicted from a predefined set of possible functions (e.g., the functions defined in the Gene Ontology).
Gene function prediction is a complex classification problem characterized by the following aspects: (1) a single gene may have multiple functions; (2) the functions are organized in a hierarchy; (3) unbalanced training data for each function (far fewer positive than negative examples); (4) missing class labels; (5) availability of multiple biological data sources, such as microarray data, genome sequence and protein-protein interactions. As participants in the 2011 Critical Assessment of Function Annotation (CAFA) challenge, our team achieved the highest AUC accuracy among 45 groups. In the competition, we gained by focusing on the fifth aspect of the problem. Thus, in this dissertation, I will discuss several schemes to integrate the prediction scores from multiple data sources and show their results. Interestingly, the experimental results show that a simple averaging integration method is competitive with other state-of-the-art data integration methods. The original spatial scan algorithm is used for the detection of spatial overdensities: the discovery of spatial subregions with significantly higher scores according to some density measure. This algorithm is widely used in identifying clusters of disease cases (e.g., identifying environmental risk factors for child leukemia). However, the original spatial scan algorithm only works on static spatial data. In this dissertation, I will propose one possible solution for spatial scan on movement data.
Temple University--Theses
Sandell, Anna. "GIS, data mining and wild land fire data within Räddningstjänsten". Thesis, University of Skövde, Department of Computer Science, 2001. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-543.
Geographical information systems (GIS), data mining and wild land fire data would theoretically be suitable to use together. However, would data mining in reality bring out any useful information from wild land fire data stored within a GIS? This report investigates whether GIS and data mining are used within Räddningstjänsten today in some municipalities of the former Skaraborg. The investigation shows that neither data mining nor GIS is used within the investigated municipalities. However, there is an interest within the organisations in using GIS in the future, as well as some kind of analysis tool, for example data mining. To show how GIS and data mining could be used in the future within Räddningstjänsten, some examples were constructed.
Isik, Narin. "Fuzzy Spatial Data Cube Construction And Its Use In Association Rule Mining". Master's thesis, METU, 2005. http://etd.lib.metu.edu.tr/upload/12606056/index.pdf.
hence, applications that assist decision-making about spatial data, like weather forecasting, traffic supervision, mobile communication, etc., have been introduced. In this thesis, more natural and precise knowledge from spatial data is generated by constructing a fuzzy spatial data cube and extracting fuzzy association rules from it in order to improve decision-making about spatial data. This involves extensive research on spatial knowledge discovery and how fuzzy logic can be used to develop it. It is stated that incorporating fuzzy logic into spatial data cube construction necessitates a new method for the aggregation of fuzzy spatial data. We illustrate how this method also enhances the meaning of fuzzy spatial generalization rules and fuzzy association rules with a case study about weather pattern searching. This study contributes to spatial knowledge discovery by generating more understandable and interesting knowledge from spatial data by extending spatial generalization with fuzzy memberships, extending the spatial aggregation in spatial data cube construction by utilizing weighted measures, and generating fuzzy association rules from the constructed fuzzy spatial data cube.
Demšar, Urška. "Data mining of geospatial data: combining visual and automatic methods". Doctoral thesis, KTH, School of Architecture and the Built Environment (ABE), 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-3892.
Most of the largest databases currently available have a strong geospatial component and contain potentially valuable information. The discipline concerned with extracting this information and knowledge is data mining. Knowledge discovery is performed by applying automatic algorithms which recognise patterns in the data.
Classical data mining algorithms assume that data are independently generated and identically distributed. Geospatial data are multidimensional, spatially autocorrelated and heterogeneous. These properties make classical data mining algorithms inappropriate for geospatial data, as their basic assumptions cease to be valid. Extracting knowledge from geospatial data therefore requires special approaches. One way to do that is to use visual data mining, where the data is presented in visual form for a human to perform the pattern recognition. When visual mining is applied to geospatial data, it is part of the discipline called exploratory geovisualisation.
Both automatic and visual data mining have their respective advantages. Computers can treat large amounts of data much faster than humans, while humans are able to recognise objects and visually explore data much more effectively than computers. A combination of visual and automatic data mining draws together human cognitive skills and computer efficiency and permits faster and more efficient knowledge discovery.
This thesis investigates if a combination of visual and automatic data mining is useful for exploration of geospatial data. Three case studies illustrate three different combinations of methods. Hierarchical clustering is combined with visual data mining for exploration of geographical metadata in the first case study. The second case study presents an attempt to explore an environmental dataset by a combination of visual mining and a Self-Organising Map. Spatial pre-processing and visual data mining methods were used in the third case study for emergency response data.
Contemporary system design methods involve user participation at all stages. These methods originated in the field of Human-Computer Interaction, but have been adapted for the geovisualisation issues related to spatial problem solving. Attention to user-centred design was present in all three case studies, but the principles were fully followed only for the third case study, where a usability assessment was performed using a combination of a formal evaluation and exploratory usability.
Bogorny, Vania. "Enhancing spatial association rule mining in geographic databases". reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2006. http://hdl.handle.net/10183/7841.
The association rule mining technique emerged with the objective of finding novel, useful, and previously unknown associations in transactional databases, and a large number of association rule mining algorithms have been proposed in the last decade. Their main drawback, which is a well-known problem, is the generation of large amounts of frequent patterns and association rules. In geographic databases the problem of mining spatial association rules increases significantly. Besides the large amount of generated patterns and rules, many patterns are well-known geographic domain associations, normally explicitly represented in geographic database schemas. The majority of existing algorithms do not guarantee the elimination of all well-known geographic dependences. The result is that the same associations represented in geographic database schemas are extracted by spatial association rule mining algorithms and presented to the user. The problem of mining spatial association rules from geographic databases requires at least three main steps: compute spatial relationships, generate frequent patterns, and extract association rules. The first step is the most effort-demanding and time-consuming task in the rule mining process, but has received little attention in the literature. The second and third steps have been considered the main problem in transactional association rule mining and have been addressed as two different problems: frequent pattern mining and association rule mining. Well-known geographic dependences which generate well-known patterns may appear in the three main steps of the spatial association rule mining process. Aiming to eliminate well-known dependences and generate more interesting patterns, this thesis presents a framework with three main methods for mining frequent geographic patterns using knowledge constraints. Semantic knowledge is used to avoid the generation of patterns that are previously known to be non-interesting.
The first method reduces the input problem, and all well-known dependences that can be eliminated without losing information are removed in data preprocessing. The second method eliminates combinations of pairs of geographic objects with dependences during frequent set generation. A third method presents a new approach to generate non-redundant frequent sets, the maximal generalized frequent sets without dependences. This method significantly reduces the number of frequent patterns and, as a consequence, the number of association rules.
Weitl, Harms Sherri K. "Temporal association rule methodologies for geo-spatial decision support /". free to MU campus, to others for purchase, 2002. http://wwwlib.umi.com/cr/mo/fullcit?p3091989.
Li, Xiaohui. "A Language and Visual Interface to Specify Complex Spatial Pattern Mining". Thesis, University of North Texas, 2006. https://digital.library.unt.edu/ark:/67531/metadc5408/.
Lee, Ho Young. "Diagnosing spatial variation patterns in manufacturing processes". Diss., Texas A&M University, 2003. http://hdl.handle.net/1969/122.
Goler, Isil. "Pattern Extraction By Using Both Spatial And Temporal Features On Turkish Meteorological Data". Master's thesis, METU, 2011. http://etd.lib.metu.edu.tr/upload/12612877/index.pdf.
Wrede, Fredrik. "An Explorative Parameter Sweep: Spatial-temporal Data Mining in Stochastic Reaction-diffusion Simulations". Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-280287.
BATRA, SHALINI. "DISCOVERY OF CLUSTERS IN SPATIAL DATABASES". University of Cincinnati / OhioLINK, 2003. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1069701237.
Schubert, Erich. "Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining". Diss., Ludwig-Maximilians-Universität München, 2013. http://nbn-resolving.de/urn:nbn:de:bvb:19-166938.
Knowledge Discovery in Databases (KDD) is the process of extracting non-trivial patterns from large databases, with the focus on extracting novel, potentially useful, statistically valid and understandable patterns. The process involves multiple phases including selection, preprocessing, evaluation and the analysis step, which is known as Data Mining. One of the key techniques of Data Mining is outlier detection, that is, the identification of observations that are unusual and seemingly inconsistent with the majority of the data set. Such rare observations can have various causes: they can be measurement errors, unusually extreme (but valid) measurements, data corruption or even manipulated data. In recent years, various outlier detection algorithms have been proposed that often appear to differ only slightly from earlier ones but "clearly outperform" the others in the experiments. A key focus of this thesis is to unify and modularize the various approaches into a common formalism to make the analysis of the actual differences easier, but at the same time increase the flexibility of the approaches by allowing the addition and replacement of modules to adapt the methods to different requirements and data types. To show the benefits of the modularized structure, (i) several existing algorithms are formalized within the new framework; (ii) new modules are added that improve the robustness, efficiency, statistical validity and score usability and that can be combined with existing methods; (iii) modules are modified to allow existing and new algorithms to run on other, often more complex data types including spatial, temporal and high-dimensional data spaces; (iv) the combination of multiple algorithm instances into an ensemble method is discussed; (v) the scalability to large data sets is improved using approximate as well as exact indexing.
The starting point is the Local Outlier Factor (LOF) algorithm, which is extended with slight modifications to increase robustness and the usability of the produced scores. In order to get the same benefits for other methods, these methods are abstracted to a general framework for local outlier detection. By abstracting from a single vector space, other data types that involve spatial and temporal relationships can be analyzed. The use of subspace and correlation neighborhoods allows the algorithms to detect new kinds of outliers in arbitrarily oriented subspaces. Improvements in the score normalization bring back a statistical intuition of probabilities to the outlier scores that previously were only useful for ranking objects, while improved models also offer explanations of why an object was considered to be an outlier. Subsequently, for different modules found in the framework, improved modules are presented that, for example, make it possible to run the same algorithms on significantly larger data sets (in approximately linear complexity instead of quadratic complexity) by accepting approximated neighborhoods at little loss in precision and effectiveness. Additionally, multiple algorithms with different intuitions can be run at the same time, and the results combined into an ensemble method that is able to detect outliers of different types. Finally, new outlier detection methods are constructed, customized for the specific problems of these real data sets. The new methods make it possible to obtain insightful results that could not be obtained with the existing methods. Because they are constructed from the same building blocks, however, there is a strong and explicit connection to the previous approaches, and by using the indexing strategies introduced earlier, the algorithms can be executed efficiently even on large data sets.
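To make the Local Outlier Factor named in the abstract above concrete, here is a minimal sketch of the basic LOF computation with a brute-force neighbour search. It illustrates only the starting-point algorithm, not the thesis's modular framework or its extensions. The function name is hypothetical, and the sketch assumes Euclidean distance, no duplicate points, and exactly k neighbours per point (ties at the k-distance are cut arbitrarily).

```python
from math import dist


def local_outlier_factor(points, k):
    """Basic LOF with brute-force neighbour search (quadratic time).

    Simplifying assumptions: Euclidean distance, no duplicate points,
    and exactly k neighbours per point even when distances tie.
    """
    n = len(points)
    # For each point: its k nearest neighbours as (distance, index) pairs.
    neighbours = []
    for i in range(n):
        ranked = sorted((dist(points[i], points[j]), j) for j in range(n) if j != i)
        neighbours.append(ranked[:k])
    # k-distance: distance to the k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neighbours]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d_ij) for d_ij, j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean density of the neighbours relative to the point's own density.
    return [
        sum(lrd[j] for _, j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]


# Usage: four points on a unit square plus one distant point.
scores = local_outlier_factor([(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)], k=2)
```

Scores near 1 indicate a point whose density matches that of its neighbours; the distant point receives a much larger score. The sorted neighbour search is the quadratic step that index structures or approximate neighbourhoods would replace.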
Dos, Santos Raimundo Fonseca Jr. "Effective Methods of Semantic Analysis in Spatial Contexts". Diss., Virginia Tech, 2014. http://hdl.handle.net/10919/49697.
Ph. D.
Icev, Aleksandar. "DARM distance-based association rule mining". Link to electronic thesis, 2003. http://www.wpi.edu/Pubs/ETD/Available/etd-0506103-132405.
Leighty, Brian David. "Data Mining for Induction of Adjacency Grammars and Application to Terrain Pattern Recognition". NSUWorks, 2009. http://nsuworks.nova.edu/gscis_etd/212.
Schmid, Klaus Arthur [Verfasser] and Matthias [Akademischer Betreuer] Renz. "Searching and mining in enriched geo-spatial data / Klaus Arthur Schmid ; Betreuer: Matthias Renz". München : Universitätsbibliothek der Ludwig-Maximilians-Universität, 2016. http://d-nb.info/1122435746/34.
Franzke, Maximilian [Verfasser] and Matthias [Akademischer Betreuer] Renz. "Querying and mining heterogeneous spatial, social, and temporal data / Maximilian Franzke ; Betreuer: Matthias Renz". München : Universitätsbibliothek der Ludwig-Maximilians-Universität, 2019. http://d-nb.info/1190563630/34.
Yan, Ping. "SPATIAL-TEMPORAL DATA ANALYTICS AND CONSUMER SHOPPING BEHAVIOR MODELING". Diss., The University of Arizona, 2010. http://hdl.handle.net/10150/195232.
Du, Xiaoxi. "Migration Motif: A Spatial-Temporal Pattern Mining Approach for Financial Markets". [Kent, Ohio] : Kent State University, 2009. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=kent1239139458.
Title from PDF t.p. (viewed Nov. 13, 2009). Advisor: Ruoming Jin. Keywords: migration motif, trajectory mining, sequential pattern mining, time series clustering. Includes bibliographical references (p. 47-57).
Wang, Xiaofeng. "New Procedures for Data Mining and Measurement Error Models with Medical Imaging Applications". Case Western Reserve University School of Graduate Studies / OhioLINK, 2005. http://rave.ohiolink.edu/etdc/view?acc_num=case1121447716.
Kucuktunc, Onur. "Result Diversification on Spatial, Multidimensional, Opinion, and Bibliographic Data". The Ohio State University, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=osu1374148621.
Khiali, Lynda. "Fouille de données à partir de séries temporelles d’images satellites". Thesis, Montpellier, 2018. http://www.theses.fr/2018MONTS046/document.
Nowadays, remotely sensed images constitute a rich source of information that can be leveraged to support several applications including risk prevention, land use planning, land cover classification and many other tasks. In this thesis, Satellite Image Time Series (SITS) are analysed to depict the dynamics of natural and semi-natural habitats. The objective is to identify, organize and highlight the evolution patterns of these areas. We introduce an object-oriented method to analyse SITS that considers segmented satellite images. Firstly, we identify the evolution profiles of the objects in the time series. Then, we analyse these profiles using machine learning methods. To identify the evolution profiles, we explore all the objects to select a subset of objects (spatio-temporal entities/reference objects) to be tracked. The evolution of the selected spatio-temporal entities is described using evolution graphs. To analyse these evolution graphs, we introduce three contributions. The first contribution explores annual SITS. It analyses the evolution graphs using clustering algorithms to identify similar evolutions among the spatio-temporal entities. In the second contribution, we perform a multi-annual cross-site analysis. We consider several study areas described by multi-annual SITS. We use the clustering algorithms to identify intra- and inter-site similarities. In the third contribution, we introduce a semi-supervised method based on constrained clustering. We propose a method to select the constraints that will be used to guide the clustering and adapt the results to the user's needs. Our contributions were evaluated on several study areas. The experimental results make it possible to pinpoint relevant landscape evolutions in each study site. We also identify the common evolutions among the different sites. In addition, the constraint selection method proposed for the constrained clustering makes it possible to identify relevant entities.
Thus, the results obtained using unsupervised learning were improved and adapted to meet the user's needs.
Schubert, Erich [Verfasser] and Hans-Peter [Akademischer Betreuer] Kriegel. "Generalized and efficient outlier detection for spatial, temporal, and high-dimensional data mining / Erich Schubert. Betreuer: Hans-Peter Kriegel". München : Universitätsbibliothek der Ludwig-Maximilians-Universität, 2013. http://d-nb.info/1048522377/34.
Daniel, Guilherme Priólli [UNESP]. "Otimização de algoritmos de agrupamento espacial baseado em densidade aplicados em grandes conjuntos de dados". Universidade Estadual Paulista (UNESP), 2016. http://hdl.handle.net/11449/143832.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
The amount of data managed by large-scale Web services has grown significantly, and these data sets have come to be called Big Data. They can be defined as large volumes of complex data from multiple sources that exceed the storage and processing capacity of current computers. An estimated 80% of the data in such sets is associated with some spatial position. Spatial data are more complex and require more processing time than alphanumeric data. For this reason, MapReduce techniques and their implementations have been used to return results in a timely manner by parallelizing data mining algorithms. This work therefore proposes two density-based spatial clustering algorithms: VDBSCAN-MR and OVDBSCAN-MR. Both algorithms use distributed and scalable processing techniques based on the MapReduce programming model in order to optimize performance and enable the analysis of Big Data sets. The experiments showed that the developed algorithms produce better-quality clusters than the algorithms taken as a baseline. Furthermore, VDBSCAN-MR performed better than the sequential algorithm and supported application to large spatial data sets.
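The density-based clustering idea that this abstract builds on can be sketched in a few lines. The code below is plain, sequential DBSCAN in pure Python. It is not the thesis's VDBSCAN-MR or OVDBSCAN-MR and contains no MapReduce step. The names `region_query`, `dbscan`, `eps` and `min_pts` are illustrative assumptions, not taken from the thesis.

```python
from math import dist  # Euclidean distance, Python 3.8+


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels
```

The neighborhood search here is a linear scan, so the whole run is quadratic in the number of points. That cost is one reason to partition and parallelize the work on large spatial data sets.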
Mendez, Chaves Diego. "A Framework for Participatory Sensing Systems". Scholar Commons, 2012. http://scholarcommons.usf.edu/etd/4135.
Fu, Kaiqun. "Spatiotemporal Event Forecasting and Analysis with Ubiquitous Urban Sensors". Diss., Virginia Tech, 2021. http://hdl.handle.net/10919/104165.
Doctor of Philosophy
Ubiquitously deployed urban sensors such as traffic speed meters, street-view cameras, and even the smartphones in everybody's pockets generate terabytes of data every hour. How to refine valuable intelligence out of this explosion of urban data and information has become one of the fruitful questions in data mining and urban computing. In this dissertation, four innovative applications are proposed to solve real-world problems with big data from urban sensors. In addition, the foreseeable ethical vulnerabilities in the research fields of urban computing and event prediction are addressed. The first work explores the connection between urban perception and crime inference. StreetNet is proposed to learn crime rankings from street view images. This work presents the design of a street view image retrieval algorithm to improve the representation of urban perception. A data-driven, spatiotemporal algorithm is proposed to find unbiased label mappings between the street view images and the crime ranking records. The second work proposes a traffic incident duration prediction model that simultaneously predicts the impact of traffic incidents and identifies the critical groups of temporal features via a multi-task learning framework. This functionality helps transportation operators and first responders judge the influence of traffic incidents. In the third work, a social media-based traffic status monitoring system is established. The system starts with a transportation-related keyword generation process. A state-of-the-art tweet summarization algorithm is designed to eliminate redundant tweet information. In addition, we show that the proposed tweet query expansion algorithm outperforms previous methods. The fourth work investigates the viability of an automatic multiclass cyberbullying detection model that can classify whether a cyberbully is targeting a victim's age, ethnicity, gender, religion, or other quality. This work represents a step towards establishing an active anti-cyberbullying presence in social media and towards a future without cyberbullying. Finally, the ethical issues facing the urban computing community are discussed. This work seeks to identify ethical vulnerabilities in three primary research directions of urban computing: urban safety analysis, urban transportation analysis, and social media analysis for urban events. Directions for future improvement from the perspective of ethics are pointed out.
Zhou, Guoqing. "Co-Location Decision Tree for Enhancing Decision-Making of Pavement Maintenance and Rehabilitation". Diss., Virginia Tech, 2011. http://hdl.handle.net/10919/26059.
Ph. D.
Ågren, Ola. "Finding, extracting and exploiting structure in text and hypertext". Doctoral thesis, Umeå universitet, Institutionen för datavetenskap, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-22352.
Information mining (often called data mining, even in Swedish) is a research area that is constantly developing. It concerns using computers to find patterns in large amounts of data, or to predict future data from data that are already available. Because more and more data are produced every year, ever higher demands are placed on the efficiency of the algorithms used to find or use the information within a reasonable time. This thesis is about extracting information from semi-structured data, finding structures in large discrete data sets, and efficiently ranking web pages from a topic-based perspective. The information extraction described here concerns support for keeping both the documentation and the source code up to date at the same time. Our solution to this problem is to let parts of the documentation (mainly the algorithm description) reside as block comments in the source code and to extract them automatically with a tool. The structures found by our structure extraction algorithms take the form of subsumptions, for example that a certain keyword is more general than another. These relationships can be used to build larger structures in the form of hierarchies or directed graphs, since the subsumptions are transitive. The tool we have developed has mainly been used to create input data for a data mining system and to visualize that input data. The main part of the research described in this thesis, however, concerns ranking web pages based on both their content and the links between them. We have created a number of algorithms and shown how they behave in comparison with other algorithms in use today. These comparisons have mainly concerned convergence rate, the stability of the algorithms given uncertain data, and finally how relevant the algorithms' answer sets were considered to be from the users' perspective. The research has focused on efficient algorithms for retrieving and handling large data sets of discrete or text-based data. In the thesis we also present a proposal for a system of tools that work together on a database consisting of "fingerprints" and other metadata about the items indexed in the database. These data can then be used by various algorithms to increase the value of what is in the database or to find the right information efficiently.
AlgExt, CHiC, ProT
Cavazzi, Stefano. "Spatial scale analysis of landscape processes for digital soil mapping in Ireland". Thesis, Cranfield University, 2013. http://dspace.lib.cranfield.ac.uk/handle/1826/8591.
Jguirim, Ines. "Modélisation et génération d'itinéraires contextuels d'activités urbaines dans la ville". Thesis, Brest, 2016. http://www.theses.fr/2016BRES0074/document.
A city is an urban agglomeration that offers diverse services to its inhabitants. It forms a complex system that depends on several social and economic factors. The configuration of space strongly influences the accessibility of the city's various functions. Spatial analysis of urban structure is carried out to study the characteristics of the space and to estimate its functional potential. The aim of the thesis is to propose an approach to spatial analysis that takes into account the structural and semantic aspects of the city. A graph-based model is proposed to represent the city's multimodal transport network, which guarantees accessibility to the various points of interest. Super-networks are used to integrate intermodal transfers into the transport model through interdependence links between the sub-graphs associated with the various modes of transportation. The temporal aspect is represented in the model by attributes specifying the temporal constraints that characterize the traversal of every node and every edge, such as the exploration time, the waiting time, and the time required for road penalties. The functional aspect is introduced by the concept of activity. We propose a conceptual model of the contextual elements that can affect the planning and execution of urban activities, such as the spatiotemporal frame and the profile of the user. This model is enriched by knowledge management, which aims to represent information about individual behaviors. The extracted knowledge is represented by a rule management system that allows contextual planning of the activity.
Remes, J. (Jukka). "Method evaluations in spatial exploratory analyses of resting-state functional magnetic resonance imaging data". Doctoral thesis, Oulun yliopisto, 2013. http://urn.fi/urn:isbn:9789526202228.
Abstract. Measurements of the brain made at rest with functional magnetic resonance imaging (fMRI) have gained an established position in the study of spontaneous brain activity. Resting-state fMRI results are often obtained using exploratory methods such as spatial independent component analysis (sICA). These methods and their software implementations are rarely evaluated comprehensively or specifically with respect to resting-state fMRI. The software is trusted to work as the method descriptions state. Many methods and parameters are used despite the lack of test data, and the validity of the models underlying the methods also remains unclear. Ensuring the quality of exploratory analyses of resting-state fMRI data would require considerably more evaluations than are currently carried out. This thesis investigated the suitability of sICA methods and software for resting-state fMRI studies. Based on the experience gained, general guidelines were created to facilitate future method evaluations. In addition, a new multiple comparison correction method, Maxmad, was developed for the statistical correction of evaluation results. The source code of a well-known sICA software package, FSL Melodic, was analyzed against the published method descriptions. The analysis revealed previously unreported and unevaluated method details, which means that correspondence between method descriptions in the literature and their software implementations should not be assumed automatically. Method implementations should be reviewed independently. As its experimental contribution, the thesis improved the credibility of sliding-window sICA by verifying the correctness of the sICA preprocessing steps. The thesis also showed that the accuracy of earlier sICA results has not suffered even though repeatability tools such as the Icasso software were not used in estimating them. The results of the thesis also call the conventional sICA model into question, so approaches that depart from it should be considered for the analysis of resting-state fMRI data. The guidelines developed to facilitate evaluations include the following principles: 1) open software development (improved error detection), 2) modular software design (evaluations that are easier to carry out than at present), 3) data-type-specific evaluations (improved validity), and 4) wide coverage of the parameter space in evaluations (improved credibility). The proposed Maxmad multiple comparison correction offers one solution to the statistical challenges of large-scale evaluations. To improve the credibility of the exploratory methods used in resting-state fMRI, the thesis proposes broad collaboration on method evaluation.
Zhang, Weimin. "Topics in living cell multiphoton laser scanning microscopy (MPLSM) image analysis". Texas A&M University, 2006. http://hdl.handle.net/1969.1/4412.
Evans, Ben Richard. "Data-driven prediction of saltmarsh morphodynamics". Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/276823.
Prananto, Agnes Kristina. "The use of remotely sensed data to analyse spatial and temporal trends in vegetation patchiness within rehabilitated bauxite mines in the Darling Range, W.A. /". Connect to this title, 2005. http://theses.library.uwa.edu.au/adt-WU2006.0012.
Da, Silva Sébastien. "Fouille de données spatiales et modélisation de linéaires de paysages agricoles". Thesis, Université de Lorraine, 2014. http://www.theses.fr/2014LORR0156/document.
This thesis is part of a partnership between INRA and INRIA in the field of knowledge extraction from spatial databases. The study focuses on the characterization and simulation of agricultural landscapes. More specifically, we focus on the linear features that structure the agricultural landscape, such as roads, irrigation ditches and hedgerows. Our goal is to model the spatial distribution of hedgerows because of their role in many ecological and environmental processes. We study how to characterize the spatial structure of hedgerows in two contrasting agricultural landscapes, one located in south-eastern France (mainly composed of orchards) and the other in Brittany (western France, bocage-type). We determine whether the spatial distribution of hedgerows is structured by the position of the more perennial linear landscape features, such as roads and ditches. Where it is, we also identify the circumstances under which this spatial distribution is structured and the scale of these structures. The implementation of the Knowledge Discovery in Databases (KDD) process comprises different preprocessing steps and data mining algorithms that combine mathematical and computational methods. The first part of the thesis focuses on the creation of a statistical spatial index, based on a geometric neighborhood concept, that characterizes the structures of hedgerows in the landscape. The results show that hedgerows depend on more permanent linear elements at short distances, and that their neighborhood is uniform beyond 150 meters. In addition, different neighborhood structures were identified depending on the orientation of hedgerows in south-eastern France but not in Brittany. The second part of the thesis explores the potential of coupling linearization methods with Markov methods. The linearization methods are based on alternatives to Hilbert curves: Hilbert adaptive paths. The linearized spatial data thus constructed were then processed with Markov methods. These methods have the advantage of serving both for machine learning and for the generation of new data, for example in the context of landscape simulation. The results show that combining these methods for the learning and automatic generation of hedgerows captures some characteristics of the different study landscapes. The first simulations are encouraging despite the need for post-processing. Finally, this work has produced a spatial data mining method based on different tools that support all stages of a classic KDD process, from the selection of data to the visualization of results. Furthermore, this method was constructed in such a way that it can also be used for data generation, a component necessary for the simulation of landscapes.
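The coupling of linearization with Markov methods described in this abstract can be illustrated with a small sketch. It makes two simplifying assumptions: it uses the standard Hilbert curve rather than the thesis's Hilbert adaptive paths, and a first-order Markov chain rather than whatever Markov models the thesis actually employs. The names `hilbert_index`, `linearize` and `transition_probabilities` are hypothetical, and the grid side must be a power of two.

```python
from collections import Counter, defaultdict


def hilbert_index(n, x, y):
    """Position of cell (x, y) along the Hilbert curve filling an n x n grid.

    n must be a power of two; this is the standard bit-manipulation mapping.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:  # rotate or flip the quadrant so the sub-curves join up
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def linearize(grid):
    """Turn a square grid (grid[y][x] = label) into a 1-D label sequence."""
    n = len(grid)
    cells = sorted(
        (hilbert_index(n, x, y), grid[y][x]) for y in range(n) for x in range(n)
    )
    return [label for _, label in cells]


def transition_probabilities(sequence):
    """Estimate first-order Markov transition probabilities from a sequence."""
    counts = defaultdict(Counter)
    for a, b in zip(sequence, sequence[1:]):
        counts[a][b] += 1
    return {
        a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
        for a, nxt in counts.items()
    }
```

Consecutive positions on the Hilbert curve are always adjacent grid cells, so the one-dimensional sequence keeps much of the spatial neighborhood structure. The estimated transition table can then be sampled to generate new sequences, which would be mapped back onto the grid.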