
Doctoral dissertations on the topic "BASED DATA"

Create a correct reference in APA, MLA, Chicago, Harvard, and many other styles


Check out the 50 best doctoral dissertations on the topic "BASED DATA".

An "Add to bibliography" button is available next to each work in the bibliography. Use it, and we will automatically create a bibliographic reference to the selected work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the publication as a ".pdf" file and read its abstract online, whenever the relevant parameters are available in the metadata.

Browse doctoral dissertations from a wide variety of disciplines and compile an appropriate bibliography.

1

Niggemann, Oliver. "Visual data mining of graph based data". [S.l. : s.n.], 2001. http://deposit.ddb.de/cgi-bin/dokserv?idn=962400505.

Full text of the source
APA, Harvard, Vancouver, ISO, and other styles
2

Li, Liangchun. "Web-based data visualization for data mining". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp03/MQ35845.pdf.

Full text of the source
APA, Harvard, Vancouver, ISO, and other styles
3

Albarakati, Rayan. "Density Based Data Clustering". CSUSB ScholarWorks, 2015. https://scholarworks.lib.csusb.edu/etd/134.

Full text of the source
Abstract:
Data clustering is a data analysis technique that groups data based on a measure of similarity. When data is well clustered the similarities between the objects in the same group are high, while the similarities between objects in different groups are low. The data clustering technique is widely applied in a variety of areas such as bioinformatics, image segmentation and market research. This project conducted an in-depth study on data clustering with a focus on density-based clustering methods. The recent density-based CFSFDP algorithm is based on the idea that cluster centers are characterized by a higher density than their neighbors and by a relatively larger distance from points with higher densities. This method has been examined, experimented with, and improved. Three methods (KNN-based, Gaussian Kernel-based and Iterative Gaussian Kernel-based) are applied in this project to improve CFSFDP density-based clustering. The methods are applied to four milestone datasets and the results are analyzed and compared.
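For readers unfamiliar with CFSFDP (clustering by fast search and find of density peaks), the density/distance idea described above can be sketched in a few lines of Python. This is a minimal, generic illustration with a hypothetical cutoff parameter `d_c`, not the KNN- or Gaussian-kernel-based improvements developed in the thesis.

```python
import numpy as np

def cfsfdp_scores(points, d_c):
    """Compute the CFSFDP decision-graph quantities for each point:
    rho   - local density: number of neighbours closer than the cutoff d_c
    delta - distance to the nearest point of higher density.
    Points with both large rho and large delta are cluster-centre candidates."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    rho = (dist < d_c).sum(axis=1) - 1          # exclude the point itself
    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = np.where(rho > rho[i])[0]
        # the densest point gets the largest distance overall
        delta[i] = dist[i].max() if higher.size == 0 else dist[i, higher].min()
    return rho, delta

# toy usage: two blobs, whose centres should stand out in the rho*delta product
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
rho, delta = cfsfdp_scores(pts, d_c=0.5)
print(np.argsort(rho * delta)[-2:])  # indices of the two most likely centres
```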
APA, Harvard, Vancouver, ISO, and other styles
4

Garingo, Gary D. "JAVA based data connectivity". Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 1997. http://handle.dtic.mil/100.2/ADA342181.

Full text of the source
Abstract:
Thesis (M.S. in Software Engineering)--Naval Postgraduate School, September 1997.
Thesis advisor(s): Luqi, V. Berzins. Includes bibliographical references (p. 63). Also available online.
APA, Harvard, Vancouver, ISO, and other styles
5

Young, G. A. "Data-based statistical methods". Thesis, University of Cambridge, 1986. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.383307.

Full text of the source
APA, Harvard, Vancouver, ISO, and other styles
6

Jäkel, Tobias. "Role-based Data Management". Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2017. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-224416.

Full text of the source
Abstract:
Database systems build an integral component of today’s software systems and as such they are the central point for storing and sharing a software system’s data while ensuring global data consistency at the same time. Introducing the primitives of roles and their accompanied metatype distinction in modeling and programming languages, results in a novel paradigm of designing, extending, and programming modern software systems. In detail, roles as modeling concept enable a separation of concerns within an entity. Along with its rigid core, an entity may acquire various roles in different contexts during its lifetime and thus, adapts its behavior and structure dynamically during runtime. Unfortunately, database systems, as important component and global consistency provider of such systems, do not keep pace with this trend. The absence of a metatype distinction, in terms of an entity’s separation of concerns, in the database system results in various problems for the software system in general, for the application developers, and finally for the database system itself. In case of relational database systems, these problems are concentrated under the term role-relational impedance mismatch. In particular, the whole software system is designed by using different semantics on various layers. In case of role-based software systems in combination with relational database systems this gap in semantics between applications and the database system increases dramatically. Consequently, the database system cannot directly represent the richer semantics of roles as well as the accompanied consistency constraints. These constraints have to be ensured by the applications and the database system loses its single point of truth characteristic in the software system. As the applications are in charge of guaranteeing global consistency, their development requires more effort in data management. Moreover, the software system’s data management is distributed over several layers, which results in an unstructured software system architecture. To overcome the role-relational impedance mismatch and bring the database system back in its rightful position as single point of truth in a software system, this thesis introduces the novel and tripartite RSQL approach. It combines a novel database model that represents the metatype distinction as first class citizen in a database system, an adapted query language on the database model’s basis, and finally a proper result representation. Precisely, RSQL’s logical database model introduces Dynamic Data Types, to directly represent the separation of concerns within an entity type on the schema level. On the instance level, the database model defines the notion of a Dynamic Tuple that combines an entity with the notion of roles and thus, allows for dynamic structure adaptations during runtime without changing an entity’s overall type. These definitions build the main data structures on which the database system operates. Moreover, formal operators connecting the query language statements with the database model data structures, complete the database model. The query language, as external database system interface, features an individual data definition, data manipulation, and data query language. Their statements directly represent the metatype distinction to address Dynamic Data Types and Dynamic Tuples, respectively. As a consequence of the novel data structures, the query processing of Dynamic Tuples is completely redesigned. 
As the last piece of a complete database integration of the role-based notion and its accompanying metatype distinction, we specify the RSQL Result Net as the result representation. It provides a novel result structure and features functionalities to navigate through query results. Finally, we evaluate all three RSQL components in comparison to a relational database system. This assessment clearly demonstrates the benefits of the roles concept’s full database integration.
APA, Harvard, Vancouver, ISO, and other styles
7

Ji, Yongnan. "Data-driven fMRI data analysis based on parcellation". Thesis, University of Nottingham, 2001. http://eprints.nottingham.ac.uk/12645/.

Full text of the source
Abstract:
Functional Magnetic Resonance Imaging (fMRI) is one of the most popular neuroimaging methods for investigating the activity of the human brain during cognitive tasks. As with many other neuroimaging tools, the group analysis of fMRI data often requires a transformation of the individual datasets to a common stereotaxic space, where the different brains have a similar global shape and size. However, the local inaccuracy of this procedure gives rise to a series of issues including a lack of true anatomical correspondence and a loss of subject-specific activations. Inter-subject parcellation of fMRI data has been proposed as a means to alleviate these problems. Within this frame, the inter-subject correspondence is achieved by isolating homologous functional parcels across individuals, rather than by matching voxel coordinates within a stereotaxic space. However, the large majority of parcellation methods still suffer from a number of shortcomings owing to their dependence on a general linear model. Indeed, for all its appeal, a GLM-based parcellation approach introduces its own biases in the form of a priori knowledge about such matters as the shape of the Hemodynamic Response Function (HRF) and task-related signal changes. In this thesis, we propose a model-free data-driven parcellation approach to single- and multi-subject parcellation. By modelling brain activation without relying on an a priori model, parcellation is optimized for each individual subject. In order to establish correspondences of parcels across different subjects, we cast this problem as a multipartite graph partitioning task. Parcels are considered as the vertices of a weighted complete multipartite graph. Cross-subject parcel matching becomes equivalent to partitioning this graph into disjoint cliques with one and only one parcel from each subject in each clique. In order to solve this NP-hard problem, we present three methods: the OBSA algorithm, a method with quadratic programming and an intuitive approach. We also introduce two quantitative measures of the quality of parcellation results. We apply our framework to two fMRI data sets and show that both our single- and multi-subject parcellation techniques rival or outperform model-based methods in terms of parcellation accuracy.
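The clique-partitioning formulation above can be approximated with a very naive greedy baseline, sketched below under the assumption that parcel similarity is supplied as pairwise score matrices against a reference subject; it is not the OBSA, quadratic-programming, or intuitive method presented in the thesis.

```python
import numpy as np

def greedy_parcel_matching(similarity):
    """Naive cross-subject parcel matching.
    similarity[s] is an (n_parcels_subject0 x n_parcels_subject_s) score matrix
    between subject 0's parcels and the parcels of the other subject s.
    Each clique takes one parcel per subject; each parcel is used at most once."""
    n_seed = similarity[0].shape[0]
    cliques = []
    used = [set() for _ in similarity]           # already-matched parcels per other subject
    for seed in range(n_seed):
        clique = [seed]
        for s, sim in enumerate(similarity):
            order = np.argsort(sim[seed])[::-1]  # most similar parcel first
            pick = next(p for p in order if p not in used[s])
            used[s].add(pick)
            clique.append(int(pick))
        cliques.append(clique)
    return cliques

# toy usage: 3 subjects, 4 parcels each, random similarities in [0, 1)
rng = np.random.default_rng(1)
sims = [rng.random((4, 4)) for _ in range(2)]    # subject 0 vs subjects 1 and 2
print(greedy_parcel_matching(sims))
```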
APA, Harvard, Vancouver, ISO, and other styles
8

Wang, Yi. "Data Management and Data Processing Support on Array-Based Scientific Data". The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1436157356.

Full text of the source
APA, Harvard, Vancouver, ISO, and other styles
9

Hong, Yili. "Reliability prediction based on complicated data and dynamic data". [Ames, Iowa : Iowa State University], 2009.

Find the full text of the source
APA, Harvard, Vancouver, ISO, and other styles
10

Chepetan, Adrian. "Microcontroller based Data Acquisition System". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/MQ62200.pdf.

Full text of the source
APA, Harvard, Vancouver, ISO, and other styles
11

Gaspar, John M. "Denoising amplicon-based metagenomic data". Thesis, University of New Hampshire, 2014. http://pqdtopen.proquest.com/#viewpdf?dispub=3581214.

Full text of the source
Abstract:

Reducing the effects of sequencing errors and PCR artifacts has emerged as an essential component in amplicon-based metagenomic studies. Denoising algorithms have been written that can reduce error rates in mock community data, in which the true sequences are known, but they were designed to be used in studies of real communities. To evaluate the outcome of the denoising process, we developed methods that do not rely on a priori knowledge of the correct sequences, and we applied these methods to a real-world dataset. We found that the denoising algorithms had substantial negative side-effects on the sequence data. For example, in the most widely used denoising pipeline, AmpliconNoise, the algorithm that was designed to remove pyrosequencing errors changed the reads in a manner inconsistent with the known spectrum of these errors, until one of the parameters was increased substantially from its default value.

With these shortcomings in mind, we developed a novel denoising program, FlowClus. FlowClus uses a systematic approach to filter and denoise reads efficiently. When denoising real datasets, FlowClus provides feedback about the process that can be used as the basis to adjust the parameters of the algorithm to suit the particular dataset. FlowClus produced a lower error rate compared to other denoising algorithms when analyzing a mock community dataset, while retaining significantly more sequence information. Among its other attributes, FlowClus can analyze longer reads being generated from current protocols and irregular flow orders. It has processed a full plate (1.5 million reads) in less than four hours; using its more efficient (but less precise) trie analysis option, this time was further reduced, to less than seven minutes.

APA, Harvard, Vancouver, ISO, and other styles
12

Vinjarapu, Saranya S. "GPU Based Scattered Data Modeling". University of Akron / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=akron1335297259.

Full text of the source
APA, Harvard, Vancouver, ISO, and other styles
13

Krisnadhi, Adila Alfa. "Ontology Pattern-Based Data Integration". Wright State University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=wright1453177798.

Full text of the source
APA, Harvard, Vancouver, ISO, and other styles
14

Islam, Naveed. "Cryptography based Visual Data Protection". Thesis, Montpellier 2, 2011. http://www.theses.fr/2011MON20178/document.

Full text of the source
Abstract:
Due to the advancements in the information and communication technologies, the transmission of multimedia data over secure or insecure communication channels has increased exponentially. The security of data in applications like safe storage, authentication, copyright protection, remote military image communication or confidential video-conferencing requires new strategies for secure transmission. Two techniques are commonly used for the secure transmission of visual data, i.e. cryptography and steganography. Cryptography achieves security by using secret keys to make the data illegible while steganography aims to hide the data in some innocent carrier signal. For shared trust and distributed environments, secret sharing schemes provide sufficient security in various communication applications. The principal objective of this thesis is to achieve protection of visual data, especially images, through modern cryptographic techniques. In this context, the focus of the work is twofold. The first part of our work focuses on the security of image data in a shared environment while the second part focuses on the integrity of image data in the encrypted domain during transmission. We propose a new sharing scheme for images which exploits the additive and multiplicative homomorphic properties of two well known public key cryptosystems, namely, the RSA and the Paillier. In traditional secret sharing schemes, the dealer partitions the secret into shares and distributes the shares to each of the players. Thus, none of the involved players participate in the creation of the shared secret and there is always a possibility that the dealer can cheat some player. On the contrary, the proposed approach employs the secret sharing scheme in a way that limits the influence of the dealer over the protocol by allowing each player to participate. The second part of our thesis emphasizes the integrity of visual data during transmission. Data integrity means that the data retains its complete structure during any operation like storage, transfer or retrieval. A single bit change in encrypted data can have a catastrophic impact on the decrypted data. We address the problem of error correction in images encrypted using the symmetric key Advanced Encryption Standard (AES) algorithm. Three methods are proposed to exploit the local statistics of the visual data and the encryption algorithm to successfully correct the errors.
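The scheme above builds on the homomorphic properties of RSA and Paillier. The toy snippet below only illustrates RSA's multiplicative homomorphism with textbook-sized, insecure parameters; the additive (Paillier) side and the actual sharing protocol are omitted.

```python
# Textbook RSA toy parameters (insecure, for illustration only).
p, q = 61, 53
n = p * q                      # 3233
e, d = 17, 2753                # e*d = 1 (mod phi(n)) for these primes

def enc(m):
    return pow(m, e, n)

def dec(c):
    return pow(c, d, n)

m1, m2 = 42, 7
c1, c2 = enc(m1), enc(m2)

# Multiplicative homomorphism: E(m1) * E(m2) = E(m1 * m2)  (mod n)
assert (c1 * c2) % n == enc((m1 * m2) % n)
assert dec((c1 * c2) % n) == (m1 * m2) % n
print("RSA multiplicative homomorphism holds:", dec((c1 * c2) % n))
```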
APA, Harvard, Vancouver, ISO, and other styles
15

Jung, Uk. "Wavelet-based Data Reduction and Mining for Multiple Functional Data". Diss., Georgia Institute of Technology, 2004. http://hdl.handle.net/1853/5084.

Full text of the source
Abstract:
Advanced technology such as various types of automatic data acquisition, management, and networking systems has created a tremendous capability for managers to access valuable production information to improve their operation quality and efficiency. Signal processing and data mining techniques are more popular than ever in many fields including intelligent manufacturing. As data sets increase in size, their exploration, manipulation, and analysis become more complicated and resource consuming. Timely synthesized information such as functional data is needed for product design, process trouble-shooting, quality/efficiency improvement and resource allocation decisions. A major obstacle in such intelligent manufacturing systems is that tools for processing a large volume of information coming from numerous stages of manufacturing operations are not available. Thus, the underlying theme of this thesis is to reduce the size of data in a mathematically rigorous framework, and apply existing or new procedures to the reduced-size data for various decision-making purposes. This thesis, first, proposes a Wavelet-based Random-effect Model which can generate multiple functional data signals which have wide fluctuations (between-signal variations) in the time domain. The random-effect wavelet atom position in the model has a locally focused impact which can be distinguished from other traditional random-effect models in the biological field. For the data-size reduction, in order to deal with heterogeneously selected wavelet coefficients for different single curves, this thesis introduces the newly defined Wavelet Vertical Energy metric of multiple curves and utilizes it for the efficient data reduction method. The newly proposed method in this thesis will select important positions for the whole set of multiple curves by comparison between every vertical energy metric and a threshold (Vertical Energy Threshold; VET) which will be optimally decided based on an objective function. The objective function balances the reconstruction error against a data reduction ratio. Based on the class membership information of each signal obtained, this thesis proposes the Vertical Group-Wise Threshold method to increase the discriminative capability of the reduced-size data so that the reduced data set retains salient differences between classes as much as possible. A real-life example (Tonnage data) shows our proposed method is promising.
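One plausible reading of the Vertical Energy selection step is sketched below, assuming equal-length curves, a fixed wavelet basis, and a user-supplied threshold in place of the thesis's objective-function-based VET optimisation.

```python
import numpy as np
import pywt

def vertical_energy_positions(curves, wavelet="db4", level=3, vet=1.0):
    """Select wavelet-coefficient positions shared by a set of curves.
    The 'vertical energy' of a position is the sum of squared coefficients
    taken across all curves at that position; positions whose energy exceeds
    the threshold (VET) are kept for data reduction."""
    coeff_matrix = np.vstack([
        np.concatenate(pywt.wavedec(c, wavelet, level=level)) for c in curves
    ])
    vertical_energy = (coeff_matrix ** 2).sum(axis=0)
    keep = np.where(vertical_energy > vet)[0]
    return keep, coeff_matrix[:, keep]

# toy usage: 20 noisy curves sharing a common bump around t = 0.3
rng = np.random.default_rng(2)
t = np.linspace(0, 1, 128)
curves = [np.exp(-((t - 0.3) ** 2) / 0.01) + 0.1 * rng.standard_normal(t.size)
          for _ in range(20)]
keep, reduced = vertical_energy_positions(curves, vet=5.0)
print("kept", keep.size, "coefficient positions; reduced block shape:", reduced.shape)
```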
APA, Harvard, Vancouver, ISO, and other styles
16

Jeong, Myong-Kee. "Wavelet-Based Methodology in Data Mining for Complicated Functional Data". Diss., Georgia Institute of Technology, 2004. http://hdl.handle.net/1853/5148.

Full text of the source
Abstract:
To handle potentially large size and complicated nonstationary functional data, we present the wavelet-based methodology in data mining for process monitoring and fault classification. Since traditional wavelet shrinkage methods for data de-noising are ineffective for the more demanding data reduction goals, this thesis presents data reduction methods based on discrete wavelet transform. Our new methods minimize objective functions to balance the tradeoff between data reduction and modeling accuracy. Several evaluation studies with four popular testing curves used in the literature and with two real-life data sets demonstrate the superiority of the proposed methods to engineering data compression and statistical data de-noising methods that are currently used to achieve data reduction goals. Further experimentation in applying a classification tree-based data mining procedure to the reduced-size data to identify process fault classes also demonstrates the excellence of the proposed methods. In this application the proposed methods, compared with analysis of original large-size data, result in lower misclassification rates with much better computational efficiency. This thesis extends the scalogram's ability for handling noisy and possibly massive data which show time-shifted patterns. The proposed thresholded scalogram is built on the fast wavelet transform, which can effectively and efficiently capture non-stationary changes in data patterns. Finally, we present a SPC procedure that adaptively determines which wavelet coefficients will be monitored, based on their shift information, which is estimated from process data. By adaptively monitoring the process, we can improve the performance of the control charts for functional data. Using a simulation study, we compare the performance of some of the recommended approaches.
APA, Harvard, Vancouver, ISO, and other styles
17

Tutak, Wes A. "Error identification and data recovery in MISR-based data compaction". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp05/mq22683.pdf.

Full text of the source
APA, Harvard, Vancouver, ISO, and other styles
18

Liu, Xiaodong. "Web-based access to a data warehouse of administrative data". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape7/PQDD_0019/MQ48278.pdf.

Full text of the source
APA, Harvard, Vancouver, ISO, and other styles
19

Kilpatrick, Stephen, Galen Rasche, Chris Cunningham, Myron Moodie i Ben Abbott. "REORDERING PACKET BASED DATA IN REAL-TIME DATA ACQUISITION SYSTEMS". International Foundation for Telemetering, 2007. http://hdl.handle.net/10150/604571.

Full text of the source
Abstract:
ITC/USA 2007 Conference Proceedings / The Forty-Third Annual International Telemetering Conference and Technical Exhibition / October 22-25, 2007 / Riviera Hotel & Convention Center, Las Vegas, Nevada
Ubiquitous internet protocol (IP) hardware has reached performance and capability levels that allow its use in data collection and real-time processing applications. Recent development experience with IP-based airborne data acquisition systems has shown that the open, pre-existing IP tools, standards, and capabilities support this form of distribution and sharing of data quite nicely, especially when combined with IP multicast. Unfortunately, the packet based nature of our approach also posed some problems that required special handling to achieve performance requirements. We have developed methods and algorithms for the filtering, selecting, and retiming problems associated with packet-based systems and present our approach in this paper.
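A reorder buffer keyed on sequence number is one common way to handle the retiming problem mentioned above; the sketch below is a generic illustration with a hypothetical window parameter, not the authors' algorithm.

```python
import heapq

def reorder(packets, window=8):
    """Re-emit (seq, payload) packets in sequence order.
    Packets are buffered in a min-heap; once the buffer exceeds `window`
    entries, the smallest sequence number is released. Packets arriving
    after their slot has been released would have to be dropped upstream."""
    heap, out = [], []
    for seq, payload in packets:
        heapq.heappush(heap, (seq, payload))
        if len(heap) > window:
            out.append(heapq.heappop(heap))
    while heap:
        out.append(heapq.heappop(heap))
    return out

# toy usage: packets arrive slightly out of order
arrived = [(1, "a"), (3, "c"), (2, "b"), (5, "e"), (4, "d"), (6, "f")]
print(reorder(arrived, window=2))   # [(1,'a'), (2,'b'), (3,'c'), (4,'d'), (5,'e'), (6,'f')]
```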
APA, Harvard, Vancouver, ISO, and other styles
20

Tziatzios, Achilleas. "Data mining of range-based classification rules for data characterization". Thesis, Cardiff University, 2014. http://orca.cf.ac.uk/65902/.

Full text of the source
Abstract:
Advances in data gathering have led to the creation of very large collections across different fields like industrial site sensor measurements or the account statuses of a financial institution's clients. The ability to learn classification rules, rules that associate specific attribute values with a specific class label, from this data is important and useful in a range of applications. While many methods to facilitate this task have been proposed, existing work has focused on categorical datasets and very few solutions that can derive classification rules of associated continuous ranges (numerical intervals) have been developed. Furthermore, these solutions have solely relied on classification performance as a means of evaluation and therefore focus on the mining of mutually exclusive classification rules and the correct prediction of the most dominant class values. As a result, existing solutions demonstrate only limited utility when applied to data characterization tasks. This thesis proposes a method that derives range-based classification rules from numerical data inspired by classification association rule mining. The presented method searches for associated numerical ranges that have a class value as their consequent and meet a set of user-defined criteria. A new interestingness measure is proposed for evaluating the density of range-based rules and four heuristic-based approaches are presented for targeting different sets of rules. Extensive experiments demonstrate the effectiveness of the new algorithm for classification tasks when compared to existing solutions and its utility as a solution for data characterization.
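A range-based rule of the kind described above can be scored with simple support and confidence counts; the density-style measure in the sketch below (matches per unit of covered volume) is an assumption for illustration and is not the interestingness measure proposed in the thesis.

```python
from math import prod

def evaluate_rule(rows, ranges, target_class):
    """rows: list of dicts with numeric attributes and a 'class' label.
    ranges: {attribute: (low, high)} antecedent; target_class: consequent."""
    covered = [r for r in rows
               if all(lo <= r[a] <= hi for a, (lo, hi) in ranges.items())]
    support = len(covered) / len(rows)
    confidence = (sum(r["class"] == target_class for r in covered) / len(covered)
                  if covered else 0.0)
    volume = prod(hi - lo for lo, hi in ranges.values())
    density = len(covered) / volume if volume else float("inf")
    return support, confidence, density

data = [
    {"temp": 21.0, "pressure": 1.1, "class": "ok"},
    {"temp": 35.5, "pressure": 2.4, "class": "fault"},
    {"temp": 33.0, "pressure": 2.2, "class": "fault"},
    {"temp": 20.5, "pressure": 1.0, "class": "ok"},
]
rule = {"temp": (30.0, 40.0), "pressure": (2.0, 3.0)}
print(evaluate_rule(data, rule, "fault"))   # high confidence for 'fault'
```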
APA, Harvard, Vancouver, ISO, and other styles
21

Hric, Jan. "EVENT BASED PREDICTIVE FAILURE DATA ANALYSIS OF RAILWAY OPERATIONAL DATA". Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-48709.

Full text of the source
Abstract:
Predictive maintenance plays a major role in operational cost reduction in several industries and the railway industry is no exception. Predictive maintenance relies on real-time data to predict and diagnose technical failures. Sensor data is usually utilized for this purpose; however, it might not always be available. Event data are a potential substitute as a source of information which could be used to diagnose and predict failures. This thesis investigates the use of event data in the railway industry for failure diagnosis and prediction. The proposed approach turns this problem into a sequence classification task, where the data is transformed into a set of sequences which are used to train the machine learning algorithm. A Long Short-Term Memory neural network is used, as it has been successfully applied to sequence classification tasks in the past. The prediction model is able to achieve high training accuracy, but it is at the moment unable to generalize the patterns and apply them to new sets of data. At the end of the thesis, the approach is evaluated and future steps are proposed to improve failure diagnosis and prediction.
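A minimal PyTorch sketch of an LSTM sequence classifier over event-ID sequences is shown below; the vocabulary size, padding scheme, and binary output are assumptions, and this is not the thesis's exact architecture.

```python
import torch
import torch.nn as nn

class EventLSTM(nn.Module):
    """Classify a sequence of event IDs (e.g. failure / no-failure)."""
    def __init__(self, n_events, emb_dim=32, hidden=64, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(n_events, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, batch):                 # batch: (B, T) event IDs, 0 = padding
        x = self.embed(batch)                 # (B, T, emb_dim)
        _, (h_n, _) = self.lstm(x)            # h_n: (1, B, hidden)
        return self.head(h_n[-1])             # (B, n_classes) logits

# toy usage: two padded event sequences with binary labels
model = EventLSTM(n_events=100)
seqs = torch.tensor([[5, 17, 17, 3, 0, 0],
                     [8,  2, 41, 9, 9, 1]])
logits = model(seqs)
loss = nn.CrossEntropyLoss()(logits, torch.tensor([1, 0]))
loss.backward()
print(logits.shape, float(loss))
```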
APA, Harvard, Vancouver, ISO, and other styles
22

Zheng, Yao. "Privacy Preservation for Cloud-Based Data Sharing and Data Analytics". Diss., Virginia Tech, 2016. http://hdl.handle.net/10919/73796.

Full text of the source
Abstract:
Data privacy is a globally recognized human right for individuals to control the access to their personal information, and bar the negative consequences from the use of this information. As communication technologies progress, the means to protect data privacy must also evolve to address new challenges that come into view. Our research goal in this dissertation is to develop privacy protection frameworks and techniques suitable for the emerging cloud-based data services, in particular privacy-preserving algorithms and protocols for the cloud-based data sharing and data analytics services. Cloud computing has enabled users to store, process, and communicate their personal information through third-party services. It has also raised privacy issues regarding losing control over data, mass harvesting of information, and un-consented disclosure of personal content. Above all, the main concern is the lack of understanding about data privacy in cloud environments. Currently, the cloud service providers either advocate the principle of third-party doctrine and deny users' rights to protect their data stored in the cloud; or rely on the notice-and-choice framework and present users with ambiguous, incomprehensible privacy statements without any meaningful privacy guarantee. In this regard, our research has three main contributions. First, to capture users' privacy expectations in cloud environments, we conceptually divide personal data into two categories, i.e., visible data and invisible data. The visible data refer to information users intentionally create, upload to, and share through the cloud; the invisible data refer to users' information retained in the cloud that is aggregated, analyzed, and repurposed without their knowledge or understanding. Second, to address users' privacy concerns raised by cloud computing, we propose two privacy protection frameworks, namely individual control and use limitation. The individual control framework emphasizes users' capability to govern the access to the visible data stored in the cloud. The use limitation framework emphasizes users' expectation to remain anonymous when the invisible data are aggregated and analyzed by cloud-based data services. Finally, we investigate various techniques to accommodate the new privacy protection frameworks, in the context of four cloud-based data services: personal health record sharing, location-based proximity test, link recommendation for social networks, and face tagging in photo management applications. For the first case, we develop a key-based protection technique to enforce fine-grained access control to users' digital health records. For the second case, we develop a key-less protection technique to achieve location-specific user selection. For the latter two cases, we develop distributed learning algorithms to prevent large scale data harvesting. We further combine these algorithms with query regulation techniques to achieve user anonymity. The picture that is emerging from the above works is a bleak one. Regarding personal data, the reality is we can no longer control them all. As communication technologies evolve, the scope of personal data has expanded beyond local, discrete silos, and integrated into the Internet. The traditional understanding of privacy must be updated to reflect these changes. In addition, because privacy is a particularly nuanced problem that is governed by context, there is no one-size-fits-all solution.
While some cases can be salvaged either by cryptography or by other means, in others a rethinking of the trade-offs between utility and privacy appears to be necessary.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
23

Mathew, Avin D. "Asset management data warehouse data modelling". Thesis, Queensland University of Technology, 2008. https://eprints.qut.edu.au/19310/1/Avin_Mathew_Thesis.pdf.

Full text of the source
Abstract:
Data are the lifeblood of an organisation, being employed by virtually all business functions within a firm. Data management, therefore, is a critical process in prolonging the life of a company and determining the success of each of an organisation’s business functions. The last decade and a half has seen data warehousing rising in priority within corporate data management as it provides an effective supporting platform for decision support tools. A cross-sectional survey conducted by this research showed that data warehousing is starting to be used within organisations for their engineering asset management, however the industry uptake is slow and has much room for development and improvement. This conclusion is also evidenced by the lack of systematic scholarly research within asset management data warehousing as compared to data warehousing for other business areas. This research is motivated by the lack of dedicated research into asset management data warehousing and attempts to provide original contributions to the area, focussing on data modelling. Integration is a fundamental characteristic of a data warehouse and facilitates the analysis of data from multiple sources. While several integration models exist for asset management, these only cover select areas of asset management. This research presents a novel conceptual data warehousing data model that integrates the numerous asset management data areas. The comprehensive ethnographic modelling methodology involved a diverse set of inputs (including data model patterns, standards, information system data models, and business process models) that described asset management data. Used as an integrated data source, the conceptual data model was verified by more than 20 experts in asset management and validated against four case studies. A large section of asset management data are stored in a relational format due to the maturity and pervasiveness of relational database management systems. Data warehousing offers the alternative approach of structuring data in a dimensional format, which suggests increased data retrieval speeds in addition to reducing analysis complexity for end users. To investigate the benefits of moving asset management data from a relational to multidimensional format, this research presents an innovative relational vs. multidimensional model evaluation procedure. To undertake an equitable comparison, the compared multidimensional are derived from an asset management relational model and as such, this research presents an original multidimensional modelling derivation methodology for asset management relational models. Multidimensional models were derived from the relational models in the asset management data exchange standard, MIMOSA OSA-EAI. The multidimensional and relational models were compared through a series of queries. It was discovered that multidimensional schemas reduced the data size and subsequently data insertion time, decreased the complexity of query conceptualisation, and improved the query execution performance across a range of query types. To facilitate the quicker uptake of these data warehouse multidimensional models within organisations, an alternate modelling methodology was investigated. This research presents an innovative approach of using a case-based reasoning methodology for data warehouse schema design. Using unique case representation and indexing techniques, the system also uses a business vocabulary repository to augment case searching and adaptation. 
The system was validated through a case study in which multidimensional schema design speed and accuracy were measured. It was found that the case-based reasoning system provided a marginal benefit, with greater benefits gained when confronted with more difficult scenarios.
APA, Harvard, Vancouver, ISO, and other styles
24

Mathew, Avin D. "Asset management data warehouse data modelling". Queensland University of Technology, 2008. http://eprints.qut.edu.au/19310/.

Full text of the source
Abstract:
Data are the lifeblood of an organisation, being employed by virtually all business functions within a firm. Data management, therefore, is a critical process in prolonging the life of a company and determining the success of each of an organisation’s business functions. The last decade and a half has seen data warehousing rising in priority within corporate data management as it provides an effective supporting platform for decision support tools. A cross-sectional survey conducted by this research showed that data warehousing is starting to be used within organisations for their engineering asset management, however the industry uptake is slow and has much room for development and improvement. This conclusion is also evidenced by the lack of systematic scholarly research within asset management data warehousing as compared to data warehousing for other business areas. This research is motivated by the lack of dedicated research into asset management data warehousing and attempts to provide original contributions to the area, focussing on data modelling. Integration is a fundamental characteristic of a data warehouse and facilitates the analysis of data from multiple sources. While several integration models exist for asset management, these only cover select areas of asset management. This research presents a novel conceptual data warehousing data model that integrates the numerous asset management data areas. The comprehensive ethnographic modelling methodology involved a diverse set of inputs (including data model patterns, standards, information system data models, and business process models) that described asset management data. Used as an integrated data source, the conceptual data model was verified by more than 20 experts in asset management and validated against four case studies. A large section of asset management data are stored in a relational format due to the maturity and pervasiveness of relational database management systems. Data warehousing offers the alternative approach of structuring data in a dimensional format, which suggests increased data retrieval speeds in addition to reducing analysis complexity for end users. To investigate the benefits of moving asset management data from a relational to multidimensional format, this research presents an innovative relational vs. multidimensional model evaluation procedure. To undertake an equitable comparison, the compared multidimensional are derived from an asset management relational model and as such, this research presents an original multidimensional modelling derivation methodology for asset management relational models. Multidimensional models were derived from the relational models in the asset management data exchange standard, MIMOSA OSA-EAI. The multidimensional and relational models were compared through a series of queries. It was discovered that multidimensional schemas reduced the data size and subsequently data insertion time, decreased the complexity of query conceptualisation, and improved the query execution performance across a range of query types. To facilitate the quicker uptake of these data warehouse multidimensional models within organisations, an alternate modelling methodology was investigated. This research presents an innovative approach of using a case-based reasoning methodology for data warehouse schema design. Using unique case representation and indexing techniques, the system also uses a business vocabulary repository to augment case searching and adaptation. 
The system was validated through a case study in which multidimensional schema design speed and accuracy were measured. It was found that the case-based reasoning system provided a marginal benefit, with greater benefits gained when confronted with more difficult scenarios.
APA, Harvard, Vancouver, ISO, and other styles
25

Grillo, Aderibigbe. "Developing a data quality scorecard that measures data quality in a data warehouse". Thesis, Brunel University, 2018. http://bura.brunel.ac.uk/handle/2438/17137.

Full text of the source
Abstract:
The main purpose of this thesis is to develop a data quality scorecard (DQS) that aligns the data quality needs of the data warehouse stakeholder group with selected data quality dimensions. To comprehend the research domain, a general and systematic literature review (SLR) was carried out, after which the research scope was established. Using Design Science Research (DSR) as the methodology to structure the research, three iterations were carried out to achieve the research aim highlighted in this thesis. In the first iteration, with DSR used as a paradigm, the artefact was built from the results of the general and systematic literature review conducted. A data quality scorecard (DQS) was conceptualised. The result of the SLR and the recommendations for designing an effective scorecard provided the input for the development of the DQS. Using a System Usability Scale (SUS) to validate the usability of the DQS, the results of the first iteration suggest that the DW stakeholders found the DQS useful. The second iteration was conducted to further evaluate the DQS through a run-through in the FMCG domain followed by semi-structured interviews. The thematic analysis of the semi-structured interviews demonstrated that the stakeholder participants found the DQS to be transparent; an additional reporting tool; integrative; easy to use; consistent; and to increase confidence in the data. However, the timeliness data dimension was found to be redundant, necessitating a modification to the DQS. The third iteration was conducted with similar steps as the second iteration but with the modified DQS in the oil and gas domain. The results from the third iteration suggest that the DQS is a useful tool that is easy to use on a daily basis. The research contributes to theory by demonstrating a novel approach to DQS design. This was achieved by ensuring the design of the DQS aligns with the data quality concern areas of the DW stakeholders and the data quality dimensions. Further, this research lays a good foundation for the future by establishing a DQS model that can be used as a base for further development.
APA, Harvard, Vancouver, ISO, and other styles
26

Kong, Jiantao. "Trusted data path protecting shared data in virtualized distributed systems". Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/33820.

Full text of the source
Abstract:
When sharing data across multiple sites, service applications should not be trusted automatically. Services that are suspected of faulty, erroneous, or malicious behaviors, or that run on systems that may be compromised, should not be able to gain access to protected data or be entrusted with the same data access rights as others. This thesis proposes a context flow model that controls the information flow in a distributed system. Each service application along with its surrounding context in a distributed system is treated as a controllable principal. This thesis defines a trust-based access control model that controls the information exchange between these principals. An online monitoring framework is used to evaluate the trustworthiness of the service applications and the underlying systems. An external communication interception runtime framework enforces trust-based access control transparently for the entire system.
APA, Harvard, Vancouver, ISO, and other styles
27

Deedman, Galvin Charles. "Building rule-based expert systems in case-based law". Thesis, University of British Columbia, 1987. http://hdl.handle.net/2429/26137.

Full text of the source
Abstract:
This thesis demonstrates that it is possible to build rule-based expert systems in case-based law using a deep-structure analysis of the law and commercially available artificial intelligence tools. Nervous shock, an area of the law of negligence, was the domain chosen. The expert whose knowledge was used to build the system was Professor J.C. Smith of the Faculty of Law at the University of British Columbia.
Peter A. Allard School of Law
Graduate
APA, Harvard, Vancouver, ISO, and other styles
28

Touma, Rizkallah. "Computer-language based data prefetching techniques". Doctoral thesis, Universitat Politècnica de Catalunya, 2019. http://hdl.handle.net/10803/665207.

Full text of the source
Abstract:
Data prefetching has long been used as a technique to improve access times to persistent data. It is based on retrieving data records from persistent storage to main memory before the records are needed. Data prefetching has been applied to a wide variety of persistent storage systems, from file systems to Relational Database Management Systems and NoSQL databases, with the aim of reducing access times to the data maintained by the system and thus improving the execution times of the applications using this data. However, most existing solutions to data prefetching have been based on information that can be retrieved from the storage system itself, whether in the form of heuristics based on the data schema or data access patterns detected by monitoring access to the system. There are multiple disadvantages of these approaches in terms of the rigidity of the heuristics they use, the accuracy of the predictions they make and/or the time they need to make these predictions, a process often performed while the applications are accessing the data and causing considerable overhead. In light of the above, this thesis proposes two novel approaches to data prefetching based on predictions made by analyzing the instructions and statements of the computer languages used to access persistent data. The proposed approaches take into consideration how the data is accessed by the higher-level applications, make accurate predictions and are performed without causing any additional overhead. The first of the proposed approaches aims at analyzing instructions of applications written in object-oriented languages in order to prefetch data from Persistent Object Stores. The approach is based on static code analysis that is done prior to the application execution and hence does not add any overhead. It also includes various strategies to deal with cases that require runtime information unavailable prior to the execution of the application. We integrate this analysis approach into an existing Persistent Object Store and run a series of extensive experiments to measure the improvement obtained by prefetching the objects predicted by the approach. The second approach analyzes statements and historic logs of the declarative query language SPARQL in order to prefetch data from RDF Triplestores. The approach measures two types of similarity between SPARQL queries in order to detect recurring query patterns in the historic logs. Afterwards, it uses the detected patterns to predict subsequent queries and launch them before they are requested to prefetch the data needed by them. Our evaluation of the proposed approach shows that it makes high-accuracy predictions and can achieve a high cache hit rate when caching the results of the predicted queries.
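The SPARQL side of the approach relies on measuring similarity between queries. The sketch below illustrates one rough possibility: Jaccard similarity over variable-normalised triple patterns extracted with a regular expression. A real system would use a proper SPARQL parser (e.g. rdflib), and this is not necessarily either of the two similarity measures used in the thesis.

```python
import re

def triple_patterns(query):
    """Crudely extract triple patterns from a SPARQL WHERE clause and
    normalise variable names so that ?x and ?person compare as equal."""
    body = re.search(r"\{(.*)\}", query, re.S).group(1)
    patterns = set()
    for part in body.split(" ."):
        tokens = part.split()
        if len(tokens) >= 3:
            patterns.add(tuple("?v" if t.startswith("?") else t for t in tokens[:3]))
    return patterns

def jaccard(q1, q2):
    a, b = triple_patterns(q1), triple_patterns(q2)
    return len(a & b) / len(a | b) if a | b else 0.0

q1 = "SELECT ?n WHERE { ?p <http://xmlns.com/foaf/0.1/name> ?n . ?p a <http://schema.org/Person> }"
q2 = "SELECT ?x WHERE { ?y <http://xmlns.com/foaf/0.1/name> ?x }"
print(jaccard(q1, q2))   # 0.5: one shared pattern out of two distinct ones
```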
APA, Harvard, Vancouver, ISO, and other styles
29

Hable, Robert. "Data-Based Decisions under Complex Uncertainty". Diss., lmu, 2009. http://nbn-resolving.de/urn:nbn:de:bvb:19-98740.

Full text of the source
APA, Harvard, Vancouver, ISO, and other styles
30

Xu, Gang. "Optimisation-based approaches for data analysis". Thesis, University College London (University of London), 2008. http://discovery.ucl.ac.uk/15947/.

Full text of the source
Abstract:
Recent advances in science and technology promote the generation of a huge amount of data from various sources including scientific experiments, social surveys and practical observations. The availability of powerful computer hardware and software offers easier ways to store datasets. However, more efficient and accurate methodologies are required to analyse datasets and extract useful information from them. This work aims at applying mathematical programming and optimisation methodologies to analyse different forms of datasets. The research focuses on three areas including data classification, community structure identification of complex networks and DNA motif discovery. Firstly, a general data classification problem is investigated. A mixed integer optimisation-based approach is proposed to reveal the patterns hidden behind training data samples using a hyper-box representation. An efficient solution methodology is then developed to extend the applicability of hyper-box classifiers to datasets with many training samples and complex structures. Secondly, the network community structure identification problem is addressed. The proposed mathematical model finds optimal modular structures of complex networks through the maximisation of network modularity metric. Communities of medium/large networks are identified through a two-stage solution algorithm developed in this thesis. Finally, the third part presents an optimisation-based framework to extract DNA motifs and consensus sequences. The problem is formulated as a mixed integer linear programming model and an iterative solution procedure is developed to identify multiple motifs in each DNA sequence. The flexibility of the proposed motif finding approach is then demonstrated to incorporate other biological features.
APA, Harvard, Vancouver, ISO, and other styles
31

Remesan, Renji. "Data based Modelling Issues in Hydroinformatics". Thesis, University of Bristol, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.520188.

Full text of the source
APA, Harvard, Vancouver, ISO, and other styles
32

Wang, Zhi. "Module-Based Analysis for "Omics" Data". Thesis, North Carolina State University, 2015. http://pqdtopen.proquest.com/#viewpdf?dispub=3690212.

Full text of the source
Abstract:

This thesis focuses on methodologies and applications of module-based analysis (MBA) in omics studies to investigate the relationships of phenotypes and biomarkers, e.g., SNPs, genes, and metabolites. As an alternative to traditional single–biomarker approaches, MBA may increase the detectability and reproducibility of results because biomarkers tend to have moderate individual effects but significant aggregate effect; it may improve the interpretability of findings and facilitate the construction of follow-up biological hypotheses because MBA assesses biomarker effects in a functional context, e.g., pathways and biological processes. Finally, for exploratory “omics” studies, which usually begin with a full scan of a long list of candidate biomarkers, MBA provides a natural way to reduce the total number of tests, and hence relax the multiple-testing burdens and improve power.

The first MBA project focuses on genetic association analysis that assesses the main and interaction effects for sets of genetic (G) and environmental (E) factors rather than for individual factors. We develop a kernel machine regression approach to evaluate the complete effect profile (i.e., the G, E, and G-by-E interaction effects separately or in combination) and construct a kernel function for the Gene-Environmental (GE) interaction directly from the genetic kernel and the environmental kernel. We use simulation studies and real data applications to show improved performance of the Kernel Machine (KM) regression method over the commonly adapted PC regression methods across a wide range of scenarios. The largest gain in power occurs when the underlying effect structure involves complex GE interactions, suggesting that the proposed method could be a useful and powerful tool for performing exploratory or confirmatory analyses in GxE-GWAS.

In the second MBA project, we extend the kernel machine framework developed in the first project to model biomarkers with network structure. Network summarizes the functional interplay among biological units; incorporating network information can more precisely model the biological effects, enhance the ability to detect true signals, and facilitate our understanding of the underlying biological mechanisms. In the work, we develop two kernel functions to capture different network structure information. Through simulations and metabolomics study, we show that the proposed network-based methods can have markedly improved power over the approaches ignoring network information.

Metabolites are the end products of cellular processes and reflect the ultimate responses of the biological system to genetic variations or environmental exposures. Because of the unique properties of metabolites, pharmacometabolomics aims to understand the underlying signatures that contribute to individual variations in drug responses and identify biomarkers that can be helpful to response predictions. To facilitate mining pharmacometabolomic data, we establish an MBA pipeline that has great practical value in detection and interpretation of signatures, which may potentially indicate a functional basis for the drug response. We illustrate the utilities of the pipeline by investigating two scientific questions in an aspirin study: (1) which metabolite changes can be attributed to aspirin intake, and (2) what are the metabolic signatures that can be helpful in predicting aspirin resistance. Results show that the MBA pipeline enables us to identify metabolic signatures that are not found in preliminary single-metabolite analysis.
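As context for the kernel construction mentioned in the first project, the sketch below composes an interaction kernel from a genetic and an environmental kernel via their element-wise (Hadamard) product, a common construction; the use of linear kernels and the product form are assumptions for illustration, not details taken from the thesis.

```python
import numpy as np

def linear_kernel(X):
    """Simple linear kernel K = X X^T / p for an n x p feature matrix."""
    return X @ X.T / X.shape[1]

rng = np.random.default_rng(3)
n = 50
G = rng.integers(0, 3, size=(n, 200)).astype(float)   # SNP genotypes coded 0/1/2
E = rng.standard_normal((n, 5))                       # environmental covariates

K_G = linear_kernel(G)          # genetic main-effect kernel
K_E = linear_kernel(E)          # environmental main-effect kernel
K_GE = K_G * K_E                # element-wise product models the GxE interaction

# In a kernel machine / variance-component model, the phenotype covariance is
# then modelled as a weighted sum of these kernels plus noise:
#   Cov(y) = sig_g * K_G + sig_e * K_E + sig_ge * K_GE + sig0 * I
print(K_G.shape, K_E.shape, K_GE.shape)
```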

Style APA, Harvard, Vancouver, ISO itp.
33

Pople, Andrew James. "Value-based maintenance using limited data". Thesis, University of Newcastle Upon Tyne, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.391958.

Pełny tekst źródła
Style APA, Harvard, Vancouver, ISO itp.
34

Hiden, Hugo George. "Data-based modelling using genetic programming". Thesis, University of Newcastle Upon Tyne, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.246137.

Pełny tekst źródła
Style APA, Harvard, Vancouver, ISO itp.
35

Lynch, Thomas J. III, Thomas E. Fortmann, Howard Briscoe i Sanford Fidell. "MULTIPROCESSOR-BASED DATA ACQUISITION AND ANALYSIS". International Foundation for Telemetering, 1989. http://hdl.handle.net/10150/614478.

Pełny tekst źródła
Streszczenie:
International Telemetering Conference Proceedings / October 30-November 02, 1989 / Town & Country Hotel & Convention Center, San Diego, California
Multiprocessing computer systems offer several attractive advantages for telemetry-related data acquisition and processing applications. These include: (1) high-bandwidth, fail-soft operation with convenient, low-cost, growth paths, (2) cost-effective integration and clustering of data acquisition, decommutation, monitoring, archiving, analysis, and display processing, and (3) support for modern telemetry system architectures that allow concurrent network access to test data (for both real-time and post-test analyses) by multiple analysts. This paper asserts that today’s general-purpose hardware and software offer viable platforms for these applications. One such system, currently under development, closely couples VME data buses and other off-the-shelf components, parallel processing computers, and commercial data analysis packages to acquire, process, display, and analyze telemetry and other data from a major weapon system. This approach blurs the formerly clear architectural distinction in telemetry data processing systems between special-purpose, front-end, preprocessing hardware and general-purpose, back-end, host computers used for further processing and display.
Style APA, Harvard, Vancouver, ISO itp.
36

Sandelius, Tim. "Graph-based Visualization of Sensor Data". Thesis, Örebro universitet, Institutionen för naturvetenskap och teknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-94170.

Pełny tekst źródła
Streszczenie:
Visualizing movement data is a heavily researched area and a complex task. In this project I have used movement data collected by sensors from Akademiska hus placed on the campus of Örebro University. The data is used to visualize movement inside the buildings through a web application written entirely in Python. Regarding connectivity between sensors, the project studies whether connectivity graphs can be generated from the information associated with specific sensors automatically or whether they must be built by hand. The project also investigates whether movement flows can be visualized with the data made available by Akademiska hus.
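The project's own code is not available in this record; as a rough sketch of how sensor connectivity might be derived from movement events, one can count how often two sensors trigger within a short time window and keep frequent pairs as graph edges. The timestamps, sensor IDs and the 30-second window below are all invented for illustration:

```python
from collections import Counter
from itertools import combinations
import networkx as nx

# Hypothetical motion events: (timestamp in seconds, sensor id).
events = [(0, "A1"), (12, "A2"), (15, "A3"), (60, "A1"), (70, "A2"), (200, "B1"), (210, "B2")]

WINDOW = 30  # sensors triggered within 30 s of each other are considered connected
pair_counts = Counter()
for (t1, s1), (t2, s2) in combinations(sorted(events), 2):
    if s1 != s2 and abs(t1 - t2) <= WINDOW:
        pair_counts[tuple(sorted((s1, s2)))] += 1

# Keep pairs seen at least twice as edges of the connectivity graph.
G = nx.Graph()
for (a, b), count in pair_counts.items():
    if count >= 2:
        G.add_edge(a, b, weight=count)

print(G.edges(data=True))  # candidate connectivity edges, to be reviewed or corrected by hand
```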
Style APA, Harvard, Vancouver, ISO itp.
37

Cheung, Ricky. "Stochastic based football simulation using data". Thesis, Uppsala universitet, Matematiska institutionen, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-359835.

Pełny tekst źródła
Streszczenie:
This thesis is an extension of a football simulator made in a previous project, where we also made different visualizations and simulators based on football data. The goal is to create a football simulator based on a modified Markov chain process, in which two chosen teams play out entire matches play-by-play. To validate our model, we compare simulated data with the data provided by Opta. Several adjustments are made to make the simulation as realistic as possible. After conducting a few experiments comparing simulated data with real data before and after these adjustments, we conclude that the model may not be accurate enough to reflect real-life matches.
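The state space and transition probabilities of the modified Markov chain are not given in this record; the toy sketch below only illustrates the general play-by-play simulation idea with invented states and probabilities. In the thesis setting, the transition probabilities would instead be estimated from the Opta event data for the two chosen teams:

```python
import random

# Invented transition probabilities between play-by-play states.
TRANSITIONS = {
    "home_possession": [("home_shot", 0.10), ("away_possession", 0.35), ("home_possession", 0.55)],
    "home_shot":       [("home_goal", 0.30), ("away_possession", 0.70)],
    "away_possession": [("away_shot", 0.10), ("home_possession", 0.35), ("away_possession", 0.55)],
    "away_shot":       [("away_goal", 0.30), ("home_possession", 0.70)],
    "home_goal":       [("away_possession", 1.0)],
    "away_goal":       [("home_possession", 1.0)],
}

def step(state):
    """Sample the next state according to the transition distribution."""
    states, probs = zip(*TRANSITIONS[state])
    return random.choices(states, weights=probs, k=1)[0]

def simulate_match(n_plays=200, seed=42):
    """Simulate one match as a fixed number of play-by-play transitions and return the score."""
    random.seed(seed)
    state, score = "home_possession", {"home": 0, "away": 0}
    for _ in range(n_plays):
        state = step(state)
        if state == "home_goal":
            score["home"] += 1
        elif state == "away_goal":
            score["away"] += 1
    return score

print(simulate_match())
```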
Style APA, Harvard, Vancouver, ISO itp.
38

Veldsman, Werner Pieter. "SNP based literature and data retrieval". Thesis, University of the Western Cape, 2016. http://hdl.handle.net/11394/5345.

Pełny tekst źródła
Streszczenie:
Magister Scientiae - MSc
Reference single nucleotide polymorphism (refSNP) identifiers are used to earmark SNPs in the human genome. These identifiers are often found in variant call format (VCF) files. RefSNPs can be useful to include as terms submitted to search engines when sourcing biomedical literature. In this thesis, the development of a bioinformatics software package is motivated, planned and implemented as a web application (http://sniphunter.sanbi.ac.za) with an application programming interface (API). The purpose is to allow scientists searching for relevant literature to query a database using refSNP identifiers and potential keywords assigned to scientific literature by the authors. Multiple queries can be launched simultaneously using either the web interface or the API. In addition, a VCF file parser was developed and packaged with the application to allow users to upload, extract and write information from VCF files to a file format that can be interpreted by the novel search engine created during this project. The parsing feature is seamlessly integrated with the web application's user interface, meaning there is no expectation on the user to learn a scripting language. This multi-faceted software system, called SNiPhunter, aims to save researchers time during life sciences literature procurement by suggesting articles based on the number of times a reference SNP identifier has been mentioned in an article, allowing the user to make a quantitative estimate of an article's relevance. A second novel feature is the inclusion of the email address of a corresponding author in the results returned to the user, which promotes communication between scientists. Moreover, links to external functional information are provided to allow researchers to examine annotations associated with their reference SNP identifier of interest. Standard information, such as digital object identifiers and publishing dates, that is typically provided by other search engines is also included in the results returned to the user.
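SNiPhunter's actual parser is not reproduced here; as a minimal sketch of the VCF-parsing step, refSNP identifiers can be read from the ID column of a VCF file and written out as search terms. The file names in the usage comment are hypothetical:

```python
import csv

def extract_rsids(vcf_path):
    """Collect refSNP identifiers (rs...) from the ID column (3rd field) of a VCF file."""
    rsids = []
    with open(vcf_path) as handle:
        for line in handle:
            if line.startswith("#"):      # skip meta-information and header lines
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) > 2 and fields[2].startswith("rs"):
                rsids.append(fields[2])
    return rsids

def write_query_terms(rsids, out_path):
    """Write one identifier per row so the list can be fed to a literature search engine."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for rsid in rsids:
            writer.writerow([rsid])

# Example usage (paths are hypothetical):
# write_query_terms(extract_rsids("variants.vcf"), "query_terms.csv")
```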
National Research Foundation (NRF) /The South African Research Chairs Initiative (SARChI)
Style APA, Harvard, Vancouver, ISO itp.
39

Kaminski, Kamil. "Data association for object-based SLAM". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-290935.

Pełny tekst źródła
Streszczenie:
The thesis tackles the problem of data association for monocular object-based SLAM, which is often omitted in related work. A method for estimating ellipsoid object landmark representations is implemented. This method uses multi-view bounding-box object detections from 2D images, with the YOLOv3 object detector providing detections and ORB-SLAM2 providing camera pose estimation. The online data association uses SIFT image feature matching and matching of landmark back-projections against bounding-box detections to associate these object detections. This combination and its evaluation is the main contribution of the thesis. The overall algorithm is tested on several datasets, both real-world and computer-rendered. The association algorithm performs well on the tested sequences, and it is shown that matching with the back-projections of the ellipsoid landmarks improves the robustness of the approach. It is also shown that, with some implementation changes, the algorithm can run in real time. The landmark estimation part works satisfactorily for landmark initialization. Based on the findings, future work is proposed.
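The association step combines SIFT matching with landmark back-projection; the sketch below illustrates only a simplified version of the back-projection half, matching detections to already projected landmark boxes by intersection-over-union with the Hungarian algorithm. All boxes are invented, and the real system projects 3D ellipsoid landmarks rather than reusing stored 2D boxes:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

# Invented example: boxes of landmarks back-projected into the current frame, and new detections.
projected = np.array([[10, 10, 50, 60], [200, 120, 260, 200]], dtype=float)
detections = np.array([[205, 125, 258, 195], [12, 8, 48, 62]], dtype=float)

cost = np.array([[1.0 - iou(p, d) for d in detections] for p in projected])
rows, cols = linear_sum_assignment(cost)  # minimise total (1 - IoU)

MIN_IOU = 0.3
matches = [(r, c) for r, c in zip(rows, cols) if 1.0 - cost[r, c] >= MIN_IOU]
print(matches)  # pairs (landmark index, detection index); unmatched detections spawn new landmarks
```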
Style APA, Harvard, Vancouver, ISO itp.
40

Kaminski, Kamil. "Data association for object-based SLAM". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291206.

Pełny tekst źródła
Streszczenie:
The thesis tackles the problem of data association for monocular object-based SLAM, which is often omitted in related work. A method for estimating ellipsoid object landmark representations is implemented. This method uses multi-view bounding-box object detections from 2D images, with the YOLOv3 object detector providing detections and ORB-SLAM2 providing camera pose estimation. The online data association uses SIFT image feature matching and matching of landmark back-projections against bounding-box detections to associate these object detections. This combination and its evaluation is the main contribution of the thesis. The overall algorithm is tested on several datasets, both real-world and computer-rendered. The association algorithm performs well on the tested sequences, and it is shown that matching with the back-projections of the ellipsoid landmarks improves the robustness of the approach. It is also shown that, with some implementation changes, the algorithm can run in real time. The landmark estimation part works satisfactorily for landmark initialization. Based on the findings, future work is proposed.
Style APA, Harvard, Vancouver, ISO itp.
41

Tavassoli, Pantea. "Web-based interface for data visualization". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-279460.

Pełny tekst źródła
Streszczenie:
In the age of Big Data and exponential digitalization, data visualization is becoming a ubiquitous tool for understanding trends and patterns and for identifying deviations, which in turn supports decision making. The purpose of this thesis is to explore how a scalable data visualization interface can be designed with the open-source web library D3.js. The interface is designed to display a range of patients’ physiological measurements to help healthcare professionals with Covid-19 diagnosis. Several prerequisites were identified through a qualitative study and proved to ease the implementation process, such as choosing a robust model that can support visualizations despite discontinuous and incomplete datasets. Since faulty visualizations may lead to potential harm in the highly sensitive medical setting, a dedicated risk analysis was deemed beneficial and was therefore formulated. The design of the interface also revealed functionality that could be considered when implementing any visualization interface, such as the rendering of different views and features that further assist the user in interpreting the visualizations.
Style APA, Harvard, Vancouver, ISO itp.
42

Liu, Dan. "Tree-based Models for Longitudinal Data". Bowling Green State University / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1399972118.

Pełny tekst źródła
Style APA, Harvard, Vancouver, ISO itp.
43

Amaglo, Worlanyo Yao. "Volume Calculation Based on LiDAR Data". Thesis, KTH, Fastigheter och byggande, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-299594.

Pełny tekst źródła
Streszczenie:
In this thesis project, the main objective is to compare and evaluate different surveying methods for volume determination: photogrammetry, terrestrial laser scanning (TLS) and aerial laser scanning (ALS), based on time consumption, efficiency and safety in the mining industry. In addition, a volumetric computational method based on coordinates is formulated to estimate the volume of stockpiles using lidar data captured with a laser scanner. A GNSS receiver, a UAV (unmanned aerial vehicle) equipped with a LiDAR sensor as well as a camera, and a terrestrial laser scanner were used for making measurements on stockpiles. Trimble Business Center and Trimble RealWorks were used in processing the LiDAR data from TLS and ALS, and two volume computation approaches were explored using both TLS and ALS LiDAR data. Agisoft Photoscan was used to process the captured images following the structure-from-motion principle. These software packages were used to estimate the volume of the stockpile. MATLAB was used to estimate the volume of the stockpile from LiDAR data, implementing a volume computation method based on the coordinates of the point cloud. The analysis considered the time taken to capture and process each data type up to the final product, and the results obtained from each data capture method were evaluated. Simulated data are also used in this project, since they can be modeled in different ways to study the effect of surface roughness (point density) on the estimated volume. Part of this project explores the use of MATLAB to filter out unwanted point clouds coming from the weeds that grow on the surface of an abandoned stockpile, as well as surface areas that were to be excluded from the volume computation, as in this case. From the results obtained, TLS and ALS do not differ much in the final estimated volume. Photogrammetry, on the other hand, estimated a higher volume compared to the other survey methods. The MATLAB implementation achieves approximately the same volume estimate as TLS and ALS within a short period of time. The point density and the filtering algorithm play a critical role in volume computation and help provide a good estimate of the stockpile. Findings from this project show that it is time-consuming to estimate the volume of a stockpile using the TLS and photogrammetric approaches. In terms of safety on an active mining site, these two survey methods have a higher risk probability compared to the ALS approach. The accuracy of the data captured and processed can be considered satisfactory for each survey method.
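The coordinate-based volume routine is not included in this record; a common simplification is to rasterise the point cloud onto an XY grid and sum prism volumes above a base plane. A minimal sketch on a synthetic cone-shaped stockpile (grid size and base height are arbitrary choices):

```python
import numpy as np

def stockpile_volume(points, cell=0.5, base_z=0.0):
    """Approximate volume by gridding XY, taking the mean height per cell, and summing prisms."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    ix = ((x - x.min()) / cell).astype(int)
    iy = ((y - y.min()) / cell).astype(int)
    volume = 0.0
    for key in set(zip(ix, iy)):
        mask = (ix == key[0]) & (iy == key[1])
        height = max(z[mask].mean() - base_z, 0.0)   # prism height above the base plane
        volume += height * cell * cell
    return volume

# Synthetic cone-shaped "stockpile" sampled as a surface point cloud.
rng = np.random.default_rng(0)
xy = rng.uniform(-10, 10, size=(20000, 2))
r = np.linalg.norm(xy, axis=1)
z = np.clip(5.0 * (1 - r / 10.0), 0.0, None)         # cone of radius 10 m and height 5 m
cloud = np.column_stack([xy, z])

print(stockpile_volume(cloud))   # analytic cone volume is pi * 10^2 * 5 / 3, roughly 523.6 m^3
```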
Style APA, Harvard, Vancouver, ISO itp.
44

Venkatesh, Prabhu. "Radio frequency-based data collection network". Master's thesis, This resource online, 1995. http://scholar.lib.vt.edu/theses/available/etd-01262010-020232/.

Pełny tekst źródła
Style APA, Harvard, Vancouver, ISO itp.
45

Björck, Olof. "Analyzing gyro data based image registration". Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-397459.

Pełny tekst źródła
Streszczenie:
An analysis of gyro sensor data with regard to rotational image registration is conducted in this thesis. This is relevant for understanding how well images captured with a moving camera can be registered using only gyro sensor data as motion input, which is commonly the case for electronic image stabilization (EIS) in handheld devices. The theory explaining how to register images based on gyro sensor data is presented, a qualitative analysis of gyro sensor data from three generic Android smartphones is conducted, and rotational image registration simulations using simulated noise as well as real gyro sensor data from the smartphones are presented. An accuracy metric for rotational image registration is introduced that measures image registration accuracy in pixels (relevant for frame-to-frame image registration) or pixels per second (relevant for video EIS). This thesis shows that noise in gyro sensor data affects image registration accuracy to an extent that is noticeable in 1080x1920 resolution video displayed on larger monitors, such as a computer monitor, or when zooming digitally, but not to any significant extent when the video is displayed on a monitor the size of a regular smartphone display without zooming. Different screen resolutions and frame rates will affect the image registration accuracy and would be interesting to investigate in further work, as would ways to improve the gyro sensor data.
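The registration model used in the thesis is not reproduced here; for a purely rotational camera model, a bare-bones version integrates the gyro's angular-rate samples into a rotation matrix R and forms the frame-to-frame homography H = K R K^-1. The intrinsics, sample rate and angular rates below are invented, and sign conventions depend on the gyro-to-camera alignment:

```python
import numpy as np

def rotation_from_gyro(omegas, dt):
    """Integrate angular-rate samples (rad/s, one row per sample) into a rotation matrix."""
    R = np.eye(3)
    for wx, wy, wz in omegas:
        theta = np.linalg.norm([wx, wy, wz]) * dt
        if theta > 0:
            skew = np.array([[0, -wz, wy],
                             [wz, 0, -wx],
                             [-wy, wx, 0]])
            axis_skew = skew * dt / theta   # skew matrix of the unit rotation axis
            # Rodrigues' formula for the small per-sample rotation.
            dR = np.eye(3) + np.sin(theta) * axis_skew + (1 - np.cos(theta)) * axis_skew @ axis_skew
            R = dR @ R
    return R

# Hypothetical camera intrinsics and 200 Hz gyro samples over one frame interval.
K = np.array([[1500.0, 0.0, 960.0],
              [0.0, 1500.0, 540.0],
              [0.0, 0.0, 1.0]])
gyro = np.tile([0.02, -0.01, 0.005], (7, 1))   # constant angular rate for the example
R = rotation_from_gyro(gyro, dt=1 / 200)

H = K @ R @ np.linalg.inv(K)     # homography registering the two frames (pure rotation model)
corner = H @ np.array([0.0, 0.0, 1.0])
print(corner[:2] / corner[2])    # where the image origin maps to in the next frame (pixels)
```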
Style APA, Harvard, Vancouver, ISO itp.
46

FIGUEROA, MARTINEZ CRISTHIAN NICOLAS. "Recommender Systems based on Linked Data". Doctoral thesis, Politecnico di Torino, 2017. http://hdl.handle.net/11583/2669963.

Pełny tekst źródła
Streszczenie:
Background: The increase in the amount of structured data published using the principles of Linked Data means that it is now more likely to find resources in the Web of Data that describe real-life concepts. However, discovering resources related to any given resource is still an open research area. This thesis studies recommender systems (RS) that use Linked Data as a source for generating recommendations, exploiting the large amount of available resources and the relationships among them. Aims: The main objective of this study was to propose a recommendation technique for resources considering semantic relationships between concepts from Linked Data. The specific objectives were: (i) define semantic relationships derived from resources, taking into account the knowledge found in Linked Data datasets; (ii) determine semantic similarity measures based on the semantic relationships derived from resources; (iii) propose an algorithm to dynamically generate automatic rankings of resources according to the defined similarity measures. Methodology: It was based on the recommendations of the Project Management Institute and the Integral Model for Engineering Professionals (Universidad del Cauca), the first for managing the project and the second for developing the experimental prototype. Accordingly, the main phases were: (i) conceptual base generation, identifying the main problems, objectives and the project scope; a Systematic Literature Review was conducted for this phase, which highlighted the relationships and similarity measures among resources in Linked Data, and the main issues, features, and types of RS based on Linked Data; (ii) solution development, covering the design and development of the experimental prototype for testing the algorithms studied in this thesis. Results: The main results obtained were: (i) the first Systematic Literature Review on RS based on Linked Data; (ii) a framework to execute and analyze recommendation algorithms based on Linked Data; (iii) a dynamic algorithm for resource recommendation based on the knowledge of Linked Data relationships; (iv) a comparative study of algorithms for RS based on Linked Data; (v) two implementations of the proposed framework, one with graph-based algorithms and the other with machine learning algorithms; (vi) the application of the framework to various scenarios to demonstrate its feasibility within the context of real applications. Conclusions: (i) The proposed framework proved useful for developing and evaluating different configurations of algorithms to create novel RS based on Linked Data suited to users' requirements, applications, domains and contexts. (ii) The layered architecture of the proposed framework also supports the reproducibility of results within the research community. (iii) Linked Data based RS are useful for presenting explanations of the recommendations because of the graph structure of the datasets. (iv) Graph-based algorithms take advantage of intrinsic relationships among resources from Linked Data; nevertheless, their execution time is still an open issue. Machine learning algorithms are also suitable and provide functions for dealing with large amounts of data, so they can help to improve the performance (execution time) of the RS; however, most of them need a training phase that requires knowing the application domain a priori in order to obtain reliable results. (v) A logical evolution of RS based on Linked Data is the combination of graph-based and machine learning algorithms, to obtain accurate results while keeping execution times low. However, research and experimentation is still needed to explore more techniques from the vast range of machine learning algorithms and determine the most suitable ones for dealing with Linked Data.
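As a toy illustration of a similarity measure built from Linked Data relationships (not the thesis's specific algorithms), resources can be compared by the overlap of their (predicate, object) pairs, for example with a Jaccard index; the resources and triples below are invented:

```python
def jaccard_similarity(a, b):
    """Jaccard index between two sets of (predicate, object) pairs describing resources."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical triples describing three film resources in a Linked Data dataset.
resources = {
    "film:Alien": {("genre", "SciFi"), ("director", "Scott"), ("decade", "1970s")},
    "film:BladeRunner": {("genre", "SciFi"), ("director", "Scott"), ("decade", "1980s")},
    "film:Amelie": {("genre", "Romance"), ("director", "Jeunet"), ("decade", "2000s")},
}

target = "film:Alien"
ranking = sorted(
    ((other, jaccard_similarity(resources[target], props))
     for other, props in resources.items() if other != target),
    key=lambda pair: pair[1],
    reverse=True,
)
print(ranking)  # resources ranked by relationship overlap with the target resource
```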
Style APA, Harvard, Vancouver, ISO itp.
47

Kis, Filip. "Prototyping with Data : Opportunistic Development of Data-Driven Interactive Applications". Doctoral thesis, KTH, Medieteknik och interaktionsdesign, MID, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-196851.

Pełny tekst źródła
Streszczenie:
There is a growing amount of digital information available from Open-Data initiatives, Internet-of-Things technologies, and web APIs in general. At the same time, the increasing amount of technology in our lives is creating a desire to take advantage of the generated data for personal or professional interests. Building interactive applications that would address this desire is challenging, since it requires advanced engineering skills that are normally reserved for professional software developers. However, more and more interactive applications are prototyped outside of enterprise environments, in more opportunistic settings. For example, knowledge workers apply end-user development techniques to solve their tasks, or groups of friends get together for a weekend hackathon in the hope of becoming the next big startup. This thesis focuses on how to design prototyping tools that support opportunistic development of interactive applications that take advantage of the growing amount of available data. In particular, the goal of this thesis is to understand the current challenges of prototyping with data and to identify important qualities of tools addressing these challenges. To accomplish this, declarative development tools were explored, while keeping the focus on what data and interaction the application should afford rather than on how they should be implemented (programmed). The work presented in this thesis was carried out as an iterative process which started with a design exploration of model-based UI development, followed by observations of prototyping practices through a series of hackathon events and the iterative design of Endev, a prototyping tool for data-driven web applications. Formative evaluations of Endev were conducted with programmers and interaction designers. The main results of this thesis are the identified challenges of prototyping with data and the key qualities required of prototyping tools that aim to address these challenges. The identified key qualities that lower the threshold for prototyping with data are: declarative prototyping, a familiar and setup-free environment, and support tools. Qualities that raise the ceiling for what can be prototyped are: support for heterogeneous data and for an advanced look and feel.
Style APA, Harvard, Vancouver, ISO itp.
48

Duan, Yuanyuan. "Statistical Predictions Based on Accelerated Degradation Data and Spatial Count Data". Diss., Virginia Tech, 2014. http://hdl.handle.net/10919/56616.

Pełny tekst źródła
Streszczenie:
This dissertation aims to develop methods for statistical predictions based on various types of data from different areas. We focus on applications from reliability and spatial epidemiology. Chapter 1 gives a general introduction of statistical predictions. Chapters 2 and 3 investigate the photodegradation of an organic coating, which is mainly caused by ultraviolet (UV) radiation but also affected by environmental factors, including temperature and humidity. In Chapter 2, we identify a physically motivated nonlinear mixed-effects model, including the effects of environmental variables, to describe the degradation path. Unit-to-unit variabilities are modeled as random effects. The maximum likelihood approach is used to estimate parameters based on the accelerated test data from laboratory. The developed model is then extended to allow for time-varying covariates and is used to predict outdoor degradation where the explanatory variables are time-varying. Chapter 3 introduces a class of models for analyzing degradation data with dynamic covariate information. We use a general path model with random effects to describe the degradation paths and a vector time series model to describe the covariate process. Shape restricted splines are used to estimate the effects of dynamic covariates on the degradation process. The unknown parameters of these models are estimated by using the maximum likelihood method. Algorithms for computing the estimated lifetime distribution are also described. The proposed methods are applied to predict the photodegradation path of an organic coating in a complicated dynamic environment. Chapter 4 investigates the Lyme disease emergency in Virginia at census tract level. Based on areal (census tract level) count data of Lyme disease cases in Virginia from 1998 to 2011, we analyze the spatial patterns of the disease using statistical smoothing techniques. We also use the space and space-time scan statistics to reveal the presence of clusters in the spatial and spatial/temporal distribution of Lyme disease. Chapter 5 builds a predictive model for Lyme disease based on historical data and environmental/demographical information of each census tract. We propose a Divide-Recombine method to take advantage of parallel computing. We compare prediction results through simulation studies, which show our method can provide comparable fitting and predicting accuracy but can achieve much more computational efficiency. We also apply the proposed method to analyze Virginia Lyme disease spatio-temporal data. Our method makes large-scale spatio-temporal predictions possible. Chapter 6 gives a general review on the contributions of this dissertation, and discusses directions for future research.
Ph. D.
Style APA, Harvard, Vancouver, ISO itp.
49

Su, Yu. "Big Data Management Framework based on Virtualization and Bitmap Data Summarization". The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1420738636.

Pełny tekst źródła
Style APA, Harvard, Vancouver, ISO itp.
50

Hsu, Pei-Lun. "Machine Learning-Based Data-Driven Traffic Flow Estimation from Mobile Data". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-300712.

Pełny tekst źródła
Streszczenie:
Comprehensive information on traffic flow is essential for vehicular emission monitoring and traffic control. However, such information is not observable everywhere and at all times on the road because of high installation costs and malfunctions of stationary sensors. In order to compensate for the weaknesses of stationary sensors, this thesis analyses an approach for inferring traffic flows from mobile data provided by INRIX, a commercial crowd-sourced traffic dataset with wide spatial coverage and high quality. The idea is to develop artificial neural network (ANN)-based models that automatically extract relations between traffic flow and INRIX measurements, e.g., speed and travel time, from historical data while accounting for temporal and spatial dependencies. We conducted experiments using four weeks of data from INRIX and stationary sensors on two adjacent road segments on the E4 highway in Stockholm. The models are validated via traffic flow estimation based on one week of INRIX data. Compared with the traditional approach that fits a stationary flow-speed relationship based on a multi-regime model, the new approach greatly improves the estimation accuracy. Moreover, the results indicate that the new approach's models are more resistant to drift in the input variables and can reduce the deterioration of estimation accuracy on the road segment without a stationary sensor. Hence, the new approach may be more appropriate for estimating traffic flows on road segments near a stationary sensor. The approach provides a highly automated means to build models that adapt to the datasets and improves estimation and imputation accuracy, and it can easily integrate new data sources to improve the models. It is therefore well suited to Intelligent Transport Systems (ITS) applications for traffic monitoring and control in the context of the Internet of Things (IoT) and Big Data.
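No model details accompany this record; the sketch below only illustrates the general idea of learning a speed/travel-time-to-flow mapping with a small neural network, trained on synthetic data shaped like a rough two-regime flow-speed relationship. The real models would additionally use lagged and neighbouring-segment features to capture temporal and spatial dependencies:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

# Synthetic probe measurements: speed (km/h) and travel time (s) over a 1 km segment,
# with flow (veh/h) following a rough two-regime pattern plus noise.
rng = np.random.default_rng(0)
speed = rng.uniform(10, 110, size=2000)
travel_time = 1000.0 / (speed / 3.6)
flow = np.where(speed > 60, 1800 - 20 * (speed - 60), 30 * speed) + rng.normal(0, 120, size=2000)

X = np.column_stack([speed, travel_time])
X_train, X_test, y_train, y_test = train_test_split(X, flow, random_state=0)

# Small multilayer perceptron mapping mobile measurements to flow at the sensor location.
model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
model.fit(X_train, y_train)

print("R^2 on held-out data:", round(model.score(X_test, y_test), 3))
```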
Style APA, Harvard, Vancouver, ISO itp.