Academic literature on the topic 'Data integration'

Create an accurate reference in APA, MLA, Chicago, Harvard, and other citation styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Data integration.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Data integration"

1. Arputhamary, B., and L. Arockiam. "Data Integration in Big Data Environment." Bonfring International Journal of Data Mining 5, no. 1 (February 10, 2015): 1–5. http://dx.doi.org/10.9756/bijdm.8001. Full text available.

2. Samrat Medavarapu, Sachin. "XML-Based Data Integration." International Journal of Science and Research (IJSR) 13, no. 8 (August 5, 2024): 1984–86. http://dx.doi.org/10.21275/sr24810074326. Full text available.

3. Vaishnawi, Chittamuru, and Bhuvana J. "Renewable Energy Integration in Cloud Data Centers." International Journal of Research Publication and Reviews 5, no. 3 (March 9, 2024): 2346–54. http://dx.doi.org/10.55248/gengpi.5.0324.0737. Full text available.

4. Olmsted, Aspen. "Heterogeneous system integration data integration guarantees." Journal of Computational Methods in Sciences and Engineering 17 (January 19, 2017): S85–S94. http://dx.doi.org/10.3233/jcm-160682. Full text available.

5. Calvanese, Diego, Giuseppe De Giacomo, Maurizio Lenzerini, Daniele Nardi, and Riccardo Rosati. "Data Integration in Data Warehousing." International Journal of Cooperative Information Systems 10, no. 3 (September 2001): 237–71. http://dx.doi.org/10.1142/s0218843001000345. Full text available.
Abstract:
Information integration is one of the most important aspects of a Data Warehouse. When data passes from the sources of the application-oriented operational environment to the Data Warehouse, possible inconsistencies and redundancies should be resolved, so that the warehouse is able to provide an integrated and reconciled view of data of the organization. We describe a novel approach to data integration in Data Warehousing. Our approach is based on a conceptual representation of the Data Warehouse application domain, and follows the so-called local-as-view paradigm: both source and Data Warehouse relations are defined as views over the conceptual model. We propose a technique for declaratively specifying suitable reconciliation correspondences to be used in order to solve conflicts among data in different sources. The main goal of the method is to support the design of mediators that materialize the data in the Data Warehouse relations. Starting from the specification of one such relation as a query over the conceptual model, a rewriting algorithm reformulates the query in terms of both the source relations and the reconciliation correspondences, thus obtaining a correct specification of how to load the data in the materialized view.

6. Nassiri, Hassana. "Data Model Integration." International Journal of New Computer Architectures and their Applications 7, no. 2 (2017): 45–49. http://dx.doi.org/10.17781/p002327. Full text available.

7. Miller, Renée J. "Open data integration." Proceedings of the VLDB Endowment 11, no. 12 (August 2018): 2130–39. http://dx.doi.org/10.14778/3229863.3240491. Full text available.

8. Dong, Xin Luna, and Divesh Srivastava. "Big data integration." Proceedings of the VLDB Endowment 6, no. 11 (August 27, 2013): 1188–89. http://dx.doi.org/10.14778/2536222.2536253. Full text available.

9. Dong, Xin Luna, and Divesh Srivastava. "Big Data Integration." Synthesis Lectures on Data Management 7, no. 1 (February 15, 2015): 1–198. http://dx.doi.org/10.2200/s00578ed1v01y201404dtm040. Full text available.

10. Vargas-Vera, Maria. "Data Integration Framework." International Journal of Knowledge Society Research 7, no. 1 (January 2016): 99–112. http://dx.doi.org/10.4018/ijksr.2016010107. Full text available.
Abstract:
This paper presents a proposal for a data integration framework. The purpose of the framework is to automatically locate records of participants from the ALSPAC database (Avon Longitudinal Study of Parents and Children) within its counterpart, the GPRD database (General Practice Research Database). The ALSPAC database is a collection of data from children and parents from before birth to late puberty. This collection contains several variables of interest for clinical researchers, but we concentrate on asthma, as a gold standard for the evaluation of asthma has been established by a clinical researcher. The main component of the framework is a module called the Mapper, which locates similar records and performs record linkage. The Mapper contains a library of similarity measures such as Jaccard, Jaro-Winkler, Monge-Elkan, MatchScore, Levenshtein, and TF-IDF similarity. Finally, the author evaluates the approach on the quality of the mappings.


Dissertations / Theses on the topic "Data integration"

1. Nadal Francesch, Sergi. "Metadata-driven data integration." Doctoral thesis, Universitat Politècnica de Catalunya, 2019. http://hdl.handle.net/10803/666947. Full text available.
Abstract:
Data has an undoubtable impact on society. Storing and processing large amounts of available data is currently one of the key success factors for an organization. Nonetheless, we are recently witnessing a change represented by huge and heterogeneous amounts of data. Indeed, 90% of the data in the world has been generated in the last two years. Thus, in order to carry out these data exploitation tasks, organizations must first perform data integration: combining data from multiple sources to yield a unified view over them. Yet, the integration of massive and heterogeneous amounts of data requires revisiting the traditional integration assumptions to cope with the new requirements posed by such data-intensive settings. This PhD thesis aims to provide a novel framework for data integration in the context of data-intensive ecosystems, which entails dealing with vast amounts of heterogeneous data, from multiple sources and in their original format. To this end, we advocate for an integration process consisting of sequential activities governed by a semantic layer, implemented via a shared repository of metadata. From a stewardship perspective, these activities are the deployment of a data integration architecture, followed by the population of such shared metadata. From a data consumption perspective, the activities are virtual and materialized data integration, the former an exploratory task and the latter a consolidation one. Following the proposed framework, we focus on providing contributions to each of the four activities. We begin by proposing a software reference architecture for semantic-aware data-intensive systems. Such an architecture serves as a blueprint to deploy a stack of systems, its core being the metadata repository. Next, we propose a graph-based metadata model as a formalism for metadata management. We focus on supporting schema and data source evolution, a predominant factor in the heterogeneous sources at hand. For virtual integration, we propose query rewriting algorithms that rely on the previously proposed metadata model. We additionally consider semantic heterogeneities in the data sources, which the proposed algorithms are capable of automatically resolving. Finally, the thesis focuses on the materialized integration activity and, to this end, proposes a method to select intermediate results to materialize in data-intensive flows. Overall, the results of this thesis serve as a contribution to the field of data integration in contemporary data-intensive ecosystems.

2. Jakonienė, Vaida. "Integration of biological data." Linköping: Linköpings universitet, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-7484. Full text available.

3. Akeel, Fatmah Y. "Secure data integration systems." Thesis, University of Southampton, 2017. https://eprints.soton.ac.uk/415716/. Full text available.
Abstract:
As the web moves increasingly towards publishing data, a significant challenge arises when integrating data from diverse sources that have heterogeneous security and privacy policies and requirements. Data Integration Systems (DIS) are concerned with integrating data from multiple data sources to resolve users' queries. DIS are prone to data leakage threats, e.g. unauthorised disclosure or secondary use of the data, that compromise the data's confidentiality and privacy. We claim that these threats are caused by the failure to implement or correctly employ confidentiality and privacy techniques, and by the failure to consider the trust levels of system entities, from the very start of system development. Data leakage also results from a failure to capture or implement the security policies imposed by the data providers on the collection, processing, and disclosure of personal and sensitive data. This research proposes a novel framework, called SecureDIS, to mitigate data leakage threats in DIS. Unlike existing approaches that secure such systems, SecureDIS helps software engineers to lessen data leakage threats during the early phases of DIS development. It comprises six components that represent a conceptualised DIS architecture: data and data sources, security policies, integration approach, integration location, data consumers, and System Security Management (SSM). Each component contains a set of informal guidelines written in natural language to be used by software engineers who build and design a DIS that handles sensitive and personal data. SecureDIS has undergone two rounds of review by experts to confirm its validity, resulting in the guidelines being evaluated and extended. Two approaches were adopted to ensure that SecureDIS is suitable for software engineers. The first was to formalise the guidelines by modelling a DIS with the SecureDIS security policies using Event-B formal methods. This verified the correctness and consistency of the model. The second approach assessed SecureDIS's applicability to a real data integration project by using a case study. The case study addressed the experts' concerns regarding the ability to apply the proposed guidelines in practice.

4. Eberius, Julian. "Query-Time Data Integration." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2015. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-191560. Full text available.
Abstract:
Today, data is collected in ever increasing scale and variety, opening up enormous potential for new insights and data-centric products. However, in many cases the volume and heterogeneity of new data sources precludes up-front integration using traditional ETL processes and data warehouses. In some cases, it is even unclear if and in what context the collected data will be utilized. Therefore, there is a need for agile methods that defer the effort of integration until the usage context is established. This thesis introduces Query-Time Data Integration as an alternative concept to traditional up-front integration. It aims at enabling users to issue ad-hoc queries on their own data as if all potential other data sources were already integrated, without declaring specific sources and mappings to use. Automated data search and integration methods are then coupled directly with query processing on the available data. The ambiguity and uncertainty introduced through fully automated retrieval and mapping methods is compensated by answering those queries with ranked lists of alternative results. Each result is then based on different data sources or query interpretations, allowing users to pick the result most suitable to their information need. To this end, this thesis makes three main contributions. Firstly, we introduce a novel method for Top-k Entity Augmentation, which is able to construct a top-k list of consistent integration results from a large corpus of heterogeneous data sources. It improves on the state-of-the-art by producing a set of individually consistent, but mutually diverse, set of alternative solutions, while minimizing the number of data sources used. Secondly, based on this novel augmentation method, we introduce the DrillBeyond system, which is able to process Open World SQL queries, i.e., queries referencing arbitrary attributes not defined in the queried database. The original database is then augmented at query time with Web data sources providing those attributes. Its hybrid augmentation/relational query processing enables the use of ad-hoc data search and integration in data analysis queries, and improves both performance and quality when compared to using separate systems for the two tasks. Finally, we studied the management of large-scale dataset corpora such as data lakes or Open Data platforms, which are used as data sources for our augmentation methods. We introduce Publish-time Data Integration as a new technique for data curation systems managing such corpora, which aims at improving the individual reusability of datasets without requiring up-front global integration. This is achieved by automatically generating metadata and format recommendations, allowing publishers to enhance their datasets with minimal effort. Collectively, these three contributions are the foundation of a Query-time Data Integration architecture, that enables ad-hoc data search and integration queries over large heterogeneous dataset collections.

5. Jakonienė, Vaida. "Integration of Biological Data." Doctoral thesis, Linköpings universitet, IISLAB - Laboratoriet för intelligenta informationssystem, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-7484. Full text available.
Abstract:
Data integration is an important procedure underlying many research tasks in the life sciences, as often multiple data sources have to be accessed to collect the relevant data. The data sources vary in content, data format, and access methods, which often vastly complicates the data retrieval process. As a result, the task of retrieving data requires a great deal of effort and expertise on the part of the user. To alleviate these difficulties, various information integration systems have been proposed in the area. However, a number of issues remain unsolved and new integration solutions are needed. The work presented in this thesis considers data integration at three different levels. 1) Integration of biological data sources deals with integrating multiple data sources from an information integration system point of view. We study properties of biological data sources and existing integration systems. Based on the study, we formulate requirements for systems integrating biological data sources. Then, we define a query language that supports queries commonly used by biologists. Also, we propose a high-level architecture for an information integration system that meets a selected set of requirements and that supports the specified query language. 2) Integration of ontologies deals with finding overlapping information between ontologies. We develop and evaluate algorithms that use life science literature and take the structure of the ontologies into account. 3) Grouping of biological data entries deals with organizing data entries into groups based on the computation of similarity values between the data entries. We propose a method that covers the main steps and components involved in similarity-based grouping procedures. The applicability of the method is illustrated by a number of test cases. Further, we develop an environment that supports comparison and evaluation of different grouping strategies. The work is supported by the implementation of: 1) a prototype for a system integrating biological data sources, called BioTRIFU, 2) algorithms for ontology alignment, and 3) an environment for evaluating strategies for similarity-based grouping of biological data, called KitEGA.

6. Peralta, Veronika. "Data Quality Evaluation in Data Integration Systems." PhD thesis, Université de Versailles-Saint Quentin en Yvelines, 2006. http://tel.archives-ouvertes.fr/tel-00325139. Full text available.
Abstract:
The need for uniform access to multiple data sources grows stronger every day, particularly in decision-support systems, which require a comprehensive analysis of data. With the development of Data Integration Systems (DIS), information quality has become a first-class property increasingly demanded by users. This thesis addresses data quality in DIS. Specifically, we are interested in the problems of evaluating the quality of the data delivered to users in response to their queries and of satisfying users' quality requirements. We also analyze how quality measures can be used to improve DIS design and data quality. Our approach consists of studying one quality factor at a time, analyzing its relationship with the DIS, proposing techniques for its evaluation, and proposing actions for its improvement. Among the quality factors that have been proposed, this thesis analyzes two: data freshness and data accuracy. We analyze the different definitions and measures that have been proposed for data freshness and accuracy, and we identify the DIS properties that have a significant impact on their evaluation. We summarize the analysis of each factor in a taxonomy, which serves to compare existing work and to highlight open problems. We propose a framework that models the different elements involved in quality evaluation, such as data sources, user queries, DIS integration processes, DIS properties, quality measures, and quality-evaluation algorithms. In particular, we model DIS integration processes as workflow processes, in which activities perform the tasks that extract, integrate, and deliver data to users. Our reasoning support for quality evaluation is a directed acyclic graph, called a quality graph, which has the same structure as the DIS and carries, as labels, the DIS properties relevant to quality evaluation. We develop evaluation algorithms that take as input the quality values of the source data and the DIS properties, and combine these values to qualify the data delivered by the DIS. They build on the graph representation, combining property values while traversing the graph. The evaluation algorithms can be specialized to account for the properties that influence quality in a concrete application. The idea behind the framework is to define a flexible context that allows the specialization of the evaluation algorithms to specific application scenarios. The quality values obtained during evaluation are compared with those expected by users, and improvement actions can be taken if the quality requirements are not met. We suggest elementary improvement actions that can be composed to improve quality in a concrete DIS. Our approach to improving data freshness consists of analyzing the DIS at different levels of abstraction so as to identify its critical points and to target improvement actions at those points. Our approach to improving data accuracy consists of partitioning query results into portions (certain attributes, certain tuples) with homogeneous accuracy. This allows user applications to display only the most accurate data, to filter out data that does not meet accuracy requirements, or to display data in tiers according to its accuracy. Compared with existing source-selection approaches, our proposal selects the most accurate portions instead of filtering out entire sources. The main contributions of this thesis are: (1) a detailed analysis of the freshness and accuracy quality factors; (2) techniques and algorithms for evaluating and improving data freshness and accuracy; and (3) a quality-evaluation prototype usable in DIS design.

7. Peralta Costabel, Veronika del Carmen. "Data quality evaluation in data integration systems." Versailles-St Quentin en Yvelines, 2006. http://www.theses.fr/2006VERS0020. Full text available.
Abstract:
This thesis deals with data quality evaluation in Data Integration Systems (DIS). Specifically, we address the problems of evaluating the quality of the data conveyed to users in response to their queries and verifying whether users' quality expectations can be achieved. We also analyze how quality measures can be used for improving the DIS and enforcing data quality. Our approach consists of studying one quality factor at a time, analyzing its impact within a DIS, proposing techniques for its evaluation, and proposing improvement actions for its enforcement. Among the quality factors that have been proposed, this thesis analyzes two of the most widely used: data freshness and data accuracy.

8. Neumaier, Sebastian, Axel Polleres, Simon Steyskal, and Jürgen Umbrich. "Data Integration for Open Data on the Web." Springer International Publishing AG, 2017. http://dx.doi.org/10.1007/978-3-319-61033-7_1. Full text available.
Abstract:
In this lecture we will discuss and introduce challenges of integrating openly available Web data and how to solve them. Firstly, while we will address this topic from the viewpoint of Semantic Web research, not all data is readily available as RDF or Linked Data, so we will give an introduction to different data formats prevalent on the Web, namely, standard formats for publishing and exchanging tabular, tree-shaped, and graph data. Secondly, not all Open Data is really completely open, so we will discuss and address issues around licences and terms of usage associated with Open Data, as well as documentation of data provenance. Thirdly, we will discuss (meta-)data quality issues associated with Open Data on the Web and how Semantic Web techniques and vocabularies can be used to describe and remedy them. Fourthly, we will address issues of searchability and integration of Open Data and discuss to what extent semantic search can help to overcome these. We close with briefly summarizing further issues not covered explicitly herein, such as multi-linguality, temporal aspects (archiving, evolution, temporal querying), as well as how/whether OWL and RDFS reasoning on top of integrated Open Data could help.

9

Cheng, Hui. "Data integration and visualization for systems biology data." Diss., Virginia Tech, 2010. http://hdl.handle.net/10919/77250.

Full text
Abstract:
Systems biology aims to understand cellular behavior in terms of the spatiotemporal interactions among cellular components such as genes, proteins, and metabolites. Comprehensive visualization tools for exploring multivariate data are needed to gain insight into the physiological processes reflected in these molecular profiles, and data fusion methods are required to study high-throughput transcriptomics, metabolomics, and proteomics data integratively before systems biology can live up to its potential. In this work I explored mathematical and statistical methods and visualization tools to resolve the prominent issues in systems biology data fusion and to gain insight into these comprehensive data. Before choosing and applying multivariate methods, it is important to know the distribution of the experimental data. Chi-square Q-Q plots and violin plots were applied to all M. truncatula and V. vinifera data, and most distributions were found to be right-skewed (Chapter 2). The biplot display provides an effective tool for reducing the dimensionality of systems biological data and displaying the molecules and time points jointly on the same plot. A biplot of the M. truncatula data revealed the overall system behavior, including unidentified compounds of interest and the dynamics of the highly responsive molecules (Chapter 3). The phase spectrum computed from the fast Fourier transform of the time-course data was found to play a more important role than the amplitude in signal reconstruction. Phase-spectrum analyses of in silico data created with two artificial biochemical networks, the Claytor model and the AB2 model, showed that the phase spectrum is indeed an effective tool in systems biological data fusion despite the heterogeneity of the data (Chapter 4). The difference between data integration and data fusion is further discussed. Biplot analysis of scaled data was applied to integrate the transcriptome, metabolome, and proteome data from the V. vinifera project. The phase spectrum combined with k-means clustering was used in integrative analyses of the transcriptome and metabolome of the M. truncatula yeast elicitation data, and of the transcriptome, metabolome, and proteome of the V. vinifera salinity stress data. The phase-spectrum analysis was compared with the biplot display as an effective tool in data fusion (Chapter 5); the results suggest that the phase spectrum may perform better than the biplot. This work was funded by the National Science Foundation Plant Genome Program, grant DBI-0109732, and by the Virginia Bioinformatics Institute.
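The phase-spectrum approach summarized in this abstract (Chapter 4) can be sketched in a few lines. This is an illustrative reconstruction with NumPy, not code from the thesis; the signals `a` and `b` are hypothetical molecular time courses:

```python
import numpy as np

def phase_spectrum(series):
    """Phase angles (radians) of the real FFT of a 1-D time course."""
    return np.angle(np.fft.rfft(series))

# Two hypothetical time courses: identical amplitude profiles,
# but the second lags the first by 0.05 time units.
t = np.linspace(0, 1, 64, endpoint=False)
a = np.sin(2 * np.pi * 4 * t)           # reference profile
b = np.sin(2 * np.pi * 4 * (t - 0.05))  # same profile, delayed

pa, pb = phase_spectrum(a), phase_spectrum(b)

# The amplitude spectra are identical, so only the phase at the dominant
# frequency bin (index 4) distinguishes the two profiles.
lag = pa[4] - pb[4]  # equals 2*pi*4*0.05 = 0.4*pi radians
```

Clustering such phase vectors (e.g., with k-means, as in Chapter 5) groups molecules that respond with similar timing even when their measurement platforms and amplitude scales differ, which is what makes the phase spectrum attractive for fusing heterogeneous omics data.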
Ph. D.
10

Hackl, Peter, and Michaela Denk. "Data Integration: Techniques and Evaluation." Austrian Statistical Society, 2004. http://epub.wu.ac.at/5631/1/435%2D1317%2D1%2DSM.pdf.

Abstract:
Within the DIECOFIS framework, ec3, the Division of Business Statistics from the Vienna University of Economics and Business Administration and ISTAT worked together to find methods to create a comprehensive database of enterprise data required for taxation microsimulations via integration of existing disparate enterprise data sources. This paper provides an overview of the broad spectrum of investigated methodology (including exact and statistical matching as well as imputation) and related statistical quality indicators, and emphasises the relevance of data integration, especially for official statistics, as a means of using available information more efficiently and improving the quality of a statistical agency's products. Finally, an outlook on an empirical study comparing different exact matching procedures in the maintenance of Statistics Austria's Business Register is presented.

Books on the topic "Data integration"

1

Genesereth, Michael. Data Integration. Cham: Springer International Publishing, 2010. http://dx.doi.org/10.1007/978-3-031-01550-2.

2

Dyché, Jill, and Evan Levy, eds. Customer Data Integration. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2012. http://dx.doi.org/10.1002/9781119202127.

3

Majkić, Zoran. Big Data Integration Theory. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-04156-8.

4

Doan, AnHai. Principles of data integration. Waltham, MA: Morgan Kaufmann, 2012.

5

Goldfedder, Jarrett. Building a Data Integration Team. Berkeley, CA: Apress, 2020. http://dx.doi.org/10.1007/978-1-4842-5653-4.

6

Davino, Cristina, and Luigi Fabbris, eds. Survey Data Collection and Integration. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-21308-3.

7

Viola de Azevedo Cunha, Mario. Market Integration Through Data Protection. Dordrecht: Springer Netherlands, 2013. http://dx.doi.org/10.1007/978-94-007-6085-1.

8

Kerr, W. Scott. Data integration using virtual repositories. [Toronto]: Kerr, 1999.

9

Ning, Kang, ed. Methodologies of Multi-Omics Data Integration and Data Mining. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-19-8210-1.

10

United States Department of the Interior, Bureau of Land Management, ALMRS Project Office. Bureau of Land Management data integration. [Denver, Colo.]: Bureau of Land Management, ALMRS Project Office, 1985.


Book chapters on the topic "Data integration"

1

Genesereth, Michael. "Basic Concepts." In Data Integration, 19–29. Cham: Springer International Publishing, 2010. http://dx.doi.org/10.1007/978-3-031-01550-2_2.

2

Genesereth, Michael. "Introduction." In Data Integration, 1–17. Cham: Springer International Publishing, 2010. http://dx.doi.org/10.1007/978-3-031-01550-2_1.

3

Genesereth, Michael. "Query Folding." In Data Integration, 31–48. Cham: Springer International Publishing, 2010. http://dx.doi.org/10.1007/978-3-031-01550-2_3.

4

Genesereth, Michael. "Query Planning." In Data Integration, 49–58. Cham: Springer International Publishing, 2010. http://dx.doi.org/10.1007/978-3-031-01550-2_4.

5

Genesereth, Michael. "Master Schema Management." In Data Integration, 59–65. Cham: Springer International Publishing, 2010. http://dx.doi.org/10.1007/978-3-031-01550-2_5.

6

Curtis, Bobby. "Data Integration." In Pro Oracle GoldenGate for the DBA, 217–24. Berkeley, CA: Apress, 2016. http://dx.doi.org/10.1007/978-1-4842-1179-3_9.

7

Revesz, Peter. "Data Integration." In Texts in Computer Science, 417–34. London: Springer London, 2009. http://dx.doi.org/10.1007/978-1-84996-095-3_17.

8

Shekhar, Shashi, and Hui Xiong. "Data Integration." In Encyclopedia of GIS, 215. Boston, MA: Springer US, 2008. http://dx.doi.org/10.1007/978-0-387-35973-1_244.

9

Fait, Aaron, and Alisdair R. Fernie. "Data Integration." In Plant Metabolic Networks, 151–71. New York, NY: Springer New York, 2008. http://dx.doi.org/10.1007/978-0-387-78745-9_6.

10

Bergamaschi, Sonia, Domenico Beneventano, Francesco Guerra, and Mirko Orsini. "Data Integration." In Handbook of Conceptual Modeling, 441–76. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-15865-0_14.


Conference papers on the topic "Data integration"

1

Mork, P., A. Rosenthal, J. Korb, and K. Samuel. "Integration Workbench: Integrating Schema Integration Tools." In 22nd International Conference on Data Engineering Workshops (ICDEW'06). IEEE, 2006. http://dx.doi.org/10.1109/icdew.2006.69.

2

Lenzerini, Maurizio. "Data integration." In the twenty-first ACM SIGMOD-SIGACT-SIGART symposium. New York, New York, USA: ACM Press, 2002. http://dx.doi.org/10.1145/543613.543644.

3

Golshan, Behzad, Alon Halevy, George Mihaila, and Wang-Chiew Tan. "Data Integration." In SIGMOD/PODS'17: International Conference on Management of Data. New York, NY, USA: ACM, 2017. http://dx.doi.org/10.1145/3034786.3056124.

4

Saito, Toru, and Jinsong Ouyang. "Client-side data visualization." In 2009 IEEE International Conference on Information Reuse & Integration (IRI). IEEE, 2009. http://dx.doi.org/10.1109/iri.2009.5211550.

5

Sheng, Hao, Huajun Chen, Tong Yu, and Yelei Feng. "Linked data based semantic similarity and data mining." In 2010 IEEE International Conference on Information Reuse & Integration (IRI). IEEE, 2010. http://dx.doi.org/10.1109/iri.2010.5558957.

6

Dong, X. L., and D. Srivastava. "Big data integration." In 2013 29th IEEE International Conference on Data Engineering (ICDE 2013). IEEE, 2013. http://dx.doi.org/10.1109/icde.2013.6544914.

7

Ala'i, Riaz. "Borehole data integration." In SEG Technical Program Expanded Abstracts 1998. Society of Exploration Geophysicists, 1998. http://dx.doi.org/10.1190/1.1820493.

8

Tan, Wang-Chiew. "Deep Data Integration." In SIGMOD/PODS '21: International Conference on Management of Data. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3448016.3460534.

9

Cudre-Mauroux, Philippe. "Big Data Integration." In 2017 14th International Conference on Telecommunications (ConTEL). IEEE, 2017. http://dx.doi.org/10.23919/contel.2017.8000011.

10

Mirza, Ghulam Ali. "Value name conflict while integrating data in database integration." In 2014 11th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP). IEEE, 2014. http://dx.doi.org/10.1109/iccwamtip.2014.7073417.


Reports on the topic "Data integration"

1

Obaton, Dominique, and Cécile Liétard. EuroSea data integration. EuroSea, 2023. http://dx.doi.org/10.3289/eurosea_d3.17.

Abstract:
This D3.17 "Data Integration" deliverable has been written as a complement to the deliverables D3.13 "Data handbook" and D3.7 "Networks harmonisation recommendations". It is primarily intended to help users looking for in situ data or datasets choose the data infrastructure best suited to their needs among CMEMS (Copernicus Marine Environment Monitoring Service), EMODnet (European Marine Observation and Data network), and SeaDataNet. To start, this deliverable describes these three major European data integrators and explains how to access the data and what types of data can be found. The cooperation between these three data infrastructures is also presented. A recommendation about what type of metadata should be attached to each measurement is also included; its objective is to encourage data infrastructures to harmonise their metadata, which would allow marine data users to switch more easily from one infrastructure to another and thus extend access to more data. This deliverable also presents two case studies, in which we put ourselves in the place of an in situ marine data user. (EuroSea deliverable D3.17)
2

Critchlow, T., B. Ludaescher, M. Vouk, and C. Pu. Distributed Data Integration Infrastructure. Office of Scientific and Technical Information (OSTI), February 2003. http://dx.doi.org/10.2172/15003342.

3

Nevarez, Rene. Data Brief—Government Integration. Population Council, July 2023. http://dx.doi.org/10.31899/sbsr2023.1007.

4

Critchlow, T. J., L. Liu, C. Pu, A. Gupta, B. Ludaescher, I. Altintas, M. Vouk, D. Bitzer, M. Singh, and D. Rosnick. Scientific Data Management Center Scientific Data Integration. Office of Scientific and Technical Information (OSTI), January 2003. http://dx.doi.org/10.2172/15003250.

5

Bray, O. H. Information integration for data fusion. Office of Scientific and Technical Information (OSTI), January 1997. http://dx.doi.org/10.2172/444047.

6

Miller, R. Allen. Castability Assessment and Data Integration. Office of Scientific and Technical Information (OSTI), March 2005. http://dx.doi.org/10.2172/859291.

7

Sturdy, James T. Military Data Link Integration Application. Fort Belvoir, VA: Defense Technical Information Center, June 2004. http://dx.doi.org/10.21236/ada465745.

8

Musick, R., T. Critchlow, M. Ganesh, Z. Fidelis, A. Zemla, and T. Slezak. Data Foundry: Data Warehousing and Integration for Scientific Data Management. Office of Scientific and Technical Information (OSTI), February 2000. http://dx.doi.org/10.2172/793555.

9

Swinhoe, Martyn Thomas. Model development and data uncertainty integration. Office of Scientific and Technical Information (OSTI), December 2015. http://dx.doi.org/10.2172/1227409.

10

Swinhoe, Martyn Thomas. Model development and data uncertainty integration. Office of Scientific and Technical Information (OSTI), December 2015. http://dx.doi.org/10.2172/1227933.
