Dissertations / Theses on the topic 'Heterogeneous databases'

Consult the top 50 dissertations / theses for your research on the topic 'Heterogeneous databases.'


1

Hu, Jian. "Interoperability of heterogeneous medical databases." Thesis, Heriot-Watt University, 1994. http://hdl.handle.net/10399/1358.

2

MacKinnon, Lachlan Mhor. "Intelligent query manipulation for heterogeneous databases." Thesis, Heriot-Watt University, 1998. http://hdl.handle.net/10399/1252.

3

Bhasker, Bharat. "Query processing in heterogeneous distributed database management systems." Diss., Virginia Tech, 1992. http://hdl.handle.net/10919/39437.

Abstract:
The goal of this work is to present an advanced query processing algorithm formulated and developed in support of heterogeneous distributed database management systems. Heterogeneous distributed database management systems view the integrated data through a uniform global schema. The query processing algorithm described here produces an inexpensive strategy for a query expressed over the global schema. The research addresses the following aspects of query processing: (1) formulation of a low-level query language to express the fundamental heterogeneous database operations; (2) translation of a query expressed over the global schema into an equivalent query expressed over a conceptual schema; (3) an estimation methodology to derive the intermediate result sizes of the database operations; (4) a query decomposition algorithm to generate an efficient sequence of the basic database operations to answer the query. The first issue was addressed by developing an algebraic query language called cluster algebra. The cluster algebra consists of the following operations: (a) selection, union, intersection and difference, which are extensions of their relational algebraic counterparts to heterogeneous databases; (b) normal-join and normal-projection, which replace their counterparts, join and projection, in the relational algebra; (c) two new operators, embed and unembed, to restructure the database schema. The second issue, query translation, was addressed by developing an algorithm that translates a cluster algebra query expressed over the virtual views into an equivalent cluster algebra query expressed over the conceptual databases. A non-parametric estimation methodology to estimate the result size of a cluster algebra operation was developed to address the third issue.
Finally, this research developed a query decomposition algorithm, applicable to relational and non-relational databases, that decomposes a query by computing all profitable semi-join operations, followed by the determination of the best sequence of join operations per processing site. The join optimization is performed by formulating a zero-one integer linear program that uses the non-parametric estimation technique to compute the sizes of intermediate results. The query processing algorithm was implemented in the context of DAVID, a heterogeneous distributed database management system.
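The semi-join reduction the abstract describes is a classic distributed-query optimization: before shipping a relation between sites, it is filtered down to the tuples that can actually participate in the join. A minimal illustrative sketch (relation and attribute names are invented, not taken from the thesis):

```python
# Illustrative sketch only: a semi-join keeps the tuples of r that will
# participate in a later join with s, so less data is shipped between sites.

def semi_join(r, s, attr):
    """Keep only tuples of r whose value on `attr` appears in s."""
    s_keys = {t[attr] for t in s}          # projection of s on the join attribute
    return [t for t in r if t[attr] in s_keys]

# Hypothetical example: site A holds `orders`, site B holds `customers`;
# only the projection of customers on `cid` needs to travel to site A.
orders = [{"cid": 1, "item": "disk"}, {"cid": 2, "item": "tape"}, {"cid": 3, "item": "cpu"}]
customers = [{"cid": 1, "region": "EU"}, {"cid": 3, "region": "US"}]

reduced = semi_join(orders, customers, "cid")
print(reduced)  # only the orders with cid 1 and 3 remain
```

In the thesis, choosing which of these reductions are "profitable" and in what order is what the zero-one integer linear program decides.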
Ph. D.
4

Xu, Lihui. "On the integration of heterogeneous deductive databases." Thesis, King's College London (University of London), 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.321953.

5

Bilander, Jesper. "Transferring heterogeneous data from generic databases into a SQL database using HTTP: Possibilities and Implementation." Thesis, Umeå universitet, Institutionen för datavetenskap, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-136466.

Abstract:
The thesis investigates the possibility of creating software that can synchronize heterogeneous data from generic databases over the Internet. To strengthen the conclusion, we implement a prototype that fetches data from numerous source-system databases and stores the data in another location that is more accessible for processing. To achieve that, previous work on topics related to the area was inspected and analyzed. Based on the investigation, the implemented prototype uses a client-server approach that communicates with a REST-like design using JSON strings and HTTP. This thesis and the resulting prototype prove that it is fully possible to create such software by combining and using existing protocols and frameworks.
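The storage half of the pipeline described above can be sketched in a few lines: rows arrive as a JSON payload (the HTTP fetch is stubbed out here) and are inserted into a SQL database. This is an illustration, not the thesis's implementation; the table and field names are invented.

```python
import json
import sqlite3

# Sketch of the JSON-over-HTTP -> SQL step: `json_payload` stands in for the
# body of an HTTP response; rows are stored in a SQLite table.

def store_rows(conn, json_payload):
    rows = json.loads(json_payload)
    conn.execute("CREATE TABLE IF NOT EXISTS measurements (source TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO measurements (source, value) VALUES (:source, :value)", rows
    )
    conn.commit()

payload = '[{"source": "db1", "value": 1.5}, {"source": "db2", "value": 2.0}]'
conn = sqlite3.connect(":memory:")
store_rows(conn, payload)
count = conn.execute("SELECT COUNT(*) FROM measurements").fetchone()[0]
print(count)  # 2
```

Parameterized `executemany` keeps the insert safe and lets heterogeneous sources map onto one schema as long as field names line up.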
6

Dhamija, Dinesh. "Synchronization of information in multiple heterogeneous manufacturing databases." Ohio : Ohio University, 1999. http://www.ohiolink.edu/etd/view.cgi?ohiou1175267226.

7

Srinivasan, Uma (Computer Science & Engineering, Faculty of Engineering, UNSW). "A FRAMEWORK FOR CONCEPTUAL INTEGRATION OF HETEROGENEOUS DATABASES." Awarded by: University of New South Wales, School of Computer Science and Engineering, 1997. http://handle.unsw.edu.au/1959.4/33463.

Abstract:
Autonomy of operations combined with decentralised management of data has given rise to a number of heterogeneous databases or information systems within an enterprise. These systems are often incompatible in structure as well as content and hence difficult to integrate. This thesis investigates the problem of heterogeneous database integration, in order to meet the increasing demand for obtaining meaningful information from multiple databases without disturbing local autonomy. In spite of heterogeneity, the unity of overall purpose within a common application domain, nevertheless, provides a degree of semantic similarity which manifests itself in the form of similar data structures and common usage patterns of existing information systems. This work introduces a conceptual integration approach that exploits the similarity in meta level information in existing systems and performs metadata mining on database objects to discover a set of concepts common to heterogeneous databases within the same application domain. The conceptual integration approach proposed here utilises the background knowledge available in database structures and usage patterns and generates a set of concepts that serve as a domain abstraction and provide a conceptual layer above existing legacy systems. This conceptual layer is further utilised by an information re-engineering framework that customises and packages information to reflect the unique needs of different user groups within the application domain. The architecture of the information re-engineering framework is based on an object-oriented model that represents the discovered concepts as customised application objects for each distinct user group.
8

Srinivasan, Uma. "A framework for conceptual integration of heterogeneous databases." [Sydney: University of New South Wales], 1997. http://www.library.unsw.edu.au/%7Ethesis/adt-NUN/public/adt-NUN1998.0002/.

9

Hansen, David Marshall. "An object oriented heterogeneous database architecture." 1995. http://content.ohsu.edu/u?/etd,196.

10

El, Khatib Hazem Turki. "Integrating information from heterogeneous databases using agents and metadata." Thesis, Heriot-Watt University, 2000. http://hdl.handle.net/10399/1214.

11

Raman, Pirabhu. "GEMS Gossip-Enabled Monitoring Service for heterogeneous distributed systems /." [Gainesville, Fla.] : University of Florida, 2002. http://purl.fcla.edu/fcla/etd/UFE0000598.

12

Chen, Liangyou. "Ad hoc integration and querying of heterogeneous online distributed databases." Diss., Mississippi State : Mississippi State University, 2004. http://library.msstate.edu/etd/show.asp?etd=etd-07092004-103428.

13

Athauda, Rukshan Indika. "Integration and querying of heterogeneous, autonomous, distributed database systems." FIU Digital Commons, 2000. http://digitalcommons.fiu.edu/etd/1332.

Abstract:
Today, databases have become an integral part of information systems. In the past two decades, we have seen different database systems being developed independently and used in different application domains. Today's interconnected networks and advanced applications, such as data warehousing, data mining & knowledge discovery and intelligent data access to information on the Web, have created a need for integrated access to such heterogeneous, autonomous, distributed database systems. Heterogeneous/multidatabase research has focused on this issue, resulting in many different approaches. However, no single, generally accepted methodology has emerged in academia or industry that provides ubiquitous intelligent data access from heterogeneous, autonomous, distributed information sources. This thesis describes a heterogeneous database system being developed at the High-Performance Database Research Center (HPDRC). A major impediment to ubiquitous deployment of multidatabase technology is the difficulty in resolving semantic heterogeneity, that is, identifying related information sources for integration and querying purposes. Our approach considers the semantics of the meta-data constructs in resolving this issue. The major contributions of the thesis work include: (i) a scalable, easy-to-implement architecture for developing a heterogeneous multidatabase system, utilizing the Semantic Binary Object-oriented Data Model (Sem-ODM) and the Semantic SQL query language to capture the semantics of the data sources being integrated and to provide an easy-to-use query facility; (ii) a methodology for semantic heterogeneity resolution by investigating the extents of the meta-data constructs of component schemas, which is shown to be correct, complete and unambiguous; (iii) a semi-automated technique for identifying semantic relations, which is the basis of semantic knowledge for integration and querying, using shared ontologies for context mediation; (iv) resolutions for schematic conflicts and a language for defining global views from a set of component Sem-ODM schemas; (v) the design of a knowledge base for storing and manipulating meta-data and knowledge acquired during the integration process, which acts as the interface between the integration and query processing modules; (vi) techniques for Semantic SQL query processing and optimization based on semantic knowledge in a heterogeneous database environment; and (vii) a framework for intelligent computing and communication on the Internet applying the concepts of our work.
14

O'Neill, Thomas Edward, and Milton John Prell. "A methodology for the integration of multiple distributed heterogeneous databases: application to databases of the Tomahawk engineering community." Monterey, Calif.: Naval Postgraduate School; Springfield, Va.: Available from National Technical Information Service, 1995. http://handle.dtic.mil/100.2/ADA304846.

Abstract:
Thesis (M.S. in Information Technology Management), Naval Postgraduate School, September 1995. Thesis advisors: Magdi Kamel, Martin J. McCaffrey. Bibliography: p. 159. Also available online.
15

O'Neill, Thomas Edward, and Milton John Prell. "A methodology for the integration of multiple distributed heterogeneous databases: application to databases of the Tomahawk engineering community." Thesis, Monterey, California. Naval Postgraduate School, 1995. http://hdl.handle.net/10945/35177.

Abstract:
Engineering 2000 is a project initiated by Naval Surface Warfare Center (NSWC) and designed to develop an infrastructure for the sharing of engineering data and information to support the current and future needs of the Tomahawk engineering support community. The goal of Engineering 2000 is to integrate the engineering, logistics, and management automated tools of the Tomahawk community into a single infrastructure. This thesis investigates the integration of Tomahawk distributed heterogeneous databases. It develops a six-step methodology for identifying an integrated strategy and architecture for heterogeneous databases in a distributed environment and applies it to the two most significant databases used by the Tomahawk engineering community, the Tomahawk Information Management Engineering System (TIMES) and the Tomahawk Engineering Exchange Network (TEXN). The application of the methodology to these databases suggests a loosely coupled architecture for integrating their data. (MM)
16

Halle, Robert F. "Extensible Markup Language (XML) based analysis and comparison of heterogeneous databases." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2001. http://handle.dtic.mil/100.2/ADA393736.

Abstract:
Thesis (M.S. in Software Engineering), Naval Postgraduate School, June 2001. Thesis advisor: Valdis Berzins. Includes bibliographical references (p. 137-138). Also available online.
17

Camargo, Renata da Silva. "Mining the evolutionary and functional relationships of proteins in heterogeneous biological databases." Thesis, University of Sheffield, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.412775.

18

Koeller, Andreas. "Integration of heterogeneous databases : discovery of meta-information and maintenance of schema-restructuring views." Link to electronic version, 2001. http://www.wpi.edu/Pubs/ETD/Available/etd-0415102-133008/.

Abstract:
Thesis (Ph. D.)--Worcester Polytechnic Institute.
UMI no. 30-30945. Keywords: schema restructuring; schema changes; meta-data discovery; data mining; data integration. Includes bibliographical references (leaves 256-274).
19

Koeller, Andreas. "Integration of Heterogeneous Databases: Discovery of Meta-Information and Maintenance of Schema-Restructuring Views." Digital WPI, 2002. https://digitalcommons.wpi.edu/etd-dissertations/116.

Abstract:
In today's networked world, information is widely distributed across many independent databases in heterogeneous formats. Integrating such information is a difficult task and has been addressed by several projects. However, previous integration solutions, such as the EVE-Project, have several shortcomings. Database contents and structure change frequently, and users often have incomplete information about the data content and structure of the databases they use. When information from several such insufficiently described sources is to be extracted and integrated, two problems have to be solved: How can we discover the structure and contents of and interrelationships among unknown databases, and how can we provide durable integration views over several such databases? In this dissertation, we have developed solutions for those key problems in information integration. The first part of the dissertation addresses the fact that knowledge about the interrelationships between databases is essential for any attempt at solving the information integration problem. We present an algorithm called FIND2, based on the clique-finding problem in graphs and k-uniform hypergraphs, to discover redundancy relationships between two relations. Furthermore, the algorithm is enhanced by heuristics that significantly reduce the search space when necessary. Extensive experimental studies on the algorithm, both with and without heuristics, illustrate its effectiveness on a variety of real-world data sets. The second part of the dissertation addresses the durable view problem and presents the first algorithm for incremental view maintenance in schema-restructuring views. Such views are essential for the integration of heterogeneous databases. They are typically defined in schema-restructuring query languages like SchemaSQL, which can transform schema into data and vice versa, making traditional view maintenance based on differential queries impossible.
Based on an existing algebra for SchemaSQL, we present an update propagation algorithm that propagates updates along the query algebra tree and prove its correctness. We also propose optimizations on our algorithm and present experimental results showing its benefits over view recomputation.
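FIND2 itself discovers maximal redundancy relationships via clique finding in hypergraphs; a much-simplified precursor of that search is checking unary inclusion dependencies, i.e. whether every value of one column also appears in a column of another relation. A toy sketch (the relations and data are invented for illustration):

```python
# Simplified illustration, not the FIND2 algorithm: find unary inclusion
# dependencies between two relations, each given as a dict mapping a column
# name to its set of values.

def unary_inds(r, s):
    """Return pairs (a, b) where column a of r is contained in column b of s."""
    return [(a, b) for a, va in r.items() for b, vb in s.items() if va <= vb]

r = {"cust_id": {1, 2, 3}, "city": {"Bonn", "Graz"}}
s = {"id": {1, 2, 3, 4}, "location": {"Bonn", "Graz", "Kiel"}}
print(unary_inds(r, s))  # [('cust_id', 'id'), ('city', 'location')]
```

The combinatorial difficulty FIND2 tackles is lifting such single-column evidence to multi-column (k-ary) dependencies, which is where the clique-finding formulation and the search-space heuristics come in.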
20

Sanz, Blasco Ismael. "Flexible techniques for heterogeneous XML data retrieval." Doctoral thesis, Universitat Jaume I, 2007. http://hdl.handle.net/10803/10373.

Abstract:
The progressive adoption of XML by new communities of users has motivated the appearance of applications that require the management of large and complex collections, which present a large amount of heterogeneity. Some relevant examples are present in the fields of bioinformatics, cultural heritage, ontology management and geographic information systems, where heterogeneity is not only reflected in the textual content of documents, but also in the presence of rich structures which cannot be properly accounted for using fixed schema definitions. Current approaches for dealing with heterogeneous XML data are, however, mainly focused on the content level, whereas at the structural level only a limited amount of heterogeneity is tolerated; for instance, weakening the parent-child relationship between nodes into the ancestor-descendant relationship.
The main objective of this thesis is devising new approaches for querying heterogeneous XML collections. This general objective has several implications: First, a collection can present different levels of heterogeneity in different granularity levels; this fact has a significant impact in the selection of specific approaches for handling, indexing and querying the collection. Therefore, several metrics are proposed for evaluating the level of heterogeneity at different levels, based on information-theoretical considerations. These metrics can be employed for characterizing collections, and clustering together those collections which present similar characteristics.
Second, the high structural variability implies that query techniques based on exact tree matching, such as the standard XPath and XQuery languages, are not suitable for heterogeneous XML collections. As a consequence, approximate querying techniques based on similarity measures must be adopted. Within the thesis, we present a formal framework for the creation of similarity measures which is based on a study of the literature that shows that most approaches for approximate XML retrieval (i) are highly tailored to very specific problems and (ii) use similarity measures for ranking that can be expressed as ad-hoc combinations of a set of 'basic' measures. Some examples of these widely used measures are tf-idf for textual information and several variations of edit distances. Our approach wraps these basic measures into generic, parametrizable components that can be combined into complex measures by exploiting the composite pattern, commonly used in Software Engineering. This approach also allows us to integrate seamlessly highly specific measures, such as protein-oriented matching functions.
Finally, these measures are employed for the approximate retrieval of data in a context of high structural heterogeneity, using a new approach based on the concepts of pattern and fragment. In our context, a pattern is a concise representation of the information needs of a user, and a fragment is a match of a pattern found in the database. A pattern consists of a set of tree-structured elements, essentially an XML subtree that is intended to be found in the database, but with a flexible semantics that is strongly dependent on a particular similarity measure. For example, depending on a particular measure, the particular hierarchy of elements, or the ordering of siblings, may or may not be deemed to be relevant when searching for occurrences in the database.
Fragment matching, as a query primitive, can deal with a much higher degree of flexibility than existing approaches. In this thesis we provide exhaustive and top-k query algorithms. In the latter case, we adopt an approach that does not require the similarity measure to be monotonic, as all previous XML top-k algorithms (usually based on Fagin's algorithm) do. We also present two extensions which are important in practical settings: a specification for the integration of the aforementioned techniques into XQuery, and a clustering algorithm that is useful to manage complex result sets.
All of the algorithms have been implemented as part of ArHeX, a toolkit for the development of multi-similarity XML applications, which supports fragment-based queries through an extension of the XQuery language, and includes graphical tools for designing similarity measures and querying collections. We have used ArHeX to demonstrate the effectiveness of our approach using both synthetic and real data sets, in the context of a biomedical research project.
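The composite pattern the abstract invokes for building similarity measures is easy to sketch: leaf measures score a pair of items, and composite measures combine the scores of their children. The measures, weights and item features below are invented for illustration, not taken from ArHeX.

```python
# Illustrative composite-pattern sketch: 'basic' similarity measures are
# leaves; composites combine child scores into a complex measure.

class Measure:
    def score(self, a, b):
        raise NotImplementedError

class NameOverlap(Measure):
    """Basic measure: Jaccard overlap of the items' tag names."""
    def score(self, a, b):
        sa, sb = set(a["tags"]), set(b["tags"])
        return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

class DepthCloseness(Measure):
    """Basic measure: closeness of nesting depth."""
    def score(self, a, b):
        return 1.0 / (1 + abs(a["depth"] - b["depth"]))

class WeightedSum(Measure):
    """Composite: weighted combination of child measures."""
    def __init__(self, children):
        self.children = children  # list of (weight, Measure) pairs
    def score(self, a, b):
        return sum(w * m.score(a, b) for w, m in self.children)

measure = WeightedSum([(0.7, NameOverlap()), (0.3, DepthCloseness())])
x = {"tags": ["protein", "name"], "depth": 2}
y = {"tags": ["protein", "id"], "depth": 3}
print(round(measure.score(x, y), 3))
```

Because composites are themselves `Measure` objects, they nest arbitrarily, which is what lets domain-specific leaves (such as protein-oriented matchers) drop into the same framework.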
21

Schroiff, Anna. "Using a Rule-System as Mediator for Heterogeneous Databases, exemplified in a Bioinformatics Use Case." Thesis, University of Skövde, School of Humanities and Informatics, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-975.

Abstract:

Databases are nowadays used in all kinds of application areas and often differ greatly in a number of properties. These differences add complexity to the handling of databases, especially when two or more different databases are dependent on each other.

The approach described here to propagate updates in an application scenario with heterogeneous, dependent databases is the use of a rule-based mediator. The system EruS (ECA rules updating SCOP) applies active database technologies in a bioinformatics scenario. Reactive behaviour based on rules is used for databases holding protein structures.

The inherent heterogeneities of the Structural Classification of Proteins (SCOP) database and the Protein Data Bank (PDB) cause inconsistencies in the SCOP data derived from PDB. This complicates research on protein structures.

EruS solves this problem by establishing rule-based interaction between the two databases. The system is built on the rule engine ruleCore with Event-Condition-Action rules to process PDB updates. It is complemented with wrappers accessing the databases to generate the events, which are executed as actions. The resulting system processes deletes and modifications of existing PDB entries and updates SCOP flatfiles with the relevant information. This is the first step in the development of EruS, which is to be extended in future work.

The project improves bioinformatics research by providing easy access to up-to-date information from PDB to SCOP users. The system can also be considered as a model for rule-based mediators in other application areas.
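The Event-Condition-Action pattern at the heart of EruS can be sketched in a few lines. This is an illustration of the ECA idea only, not the ruleCore engine or the EruS wrappers; the event types, condition and action are hypothetical.

```python
# Toy ECA dispatcher: a rule fires its action when an event of the right
# type arrives and the rule's condition holds for it.

rules = []

def rule(event_type, condition, action):
    rules.append((event_type, condition, action))

def dispatch(event):
    for event_type, condition, action in rules:
        if event["type"] == event_type and condition(event):
            action(event)

scop_updates = []
rule(
    "pdb_modified",
    condition=lambda e: e["entry"].startswith("1"),       # hypothetical condition
    action=lambda e: scop_updates.append(e["entry"]),     # record change for SCOP
)

dispatch({"type": "pdb_modified", "entry": "1abc"})
dispatch({"type": "pdb_deleted", "entry": "1xyz"})  # no rule registered: ignored
print(scop_updates)  # ['1abc']
```

In EruS the events come from a wrapper watching PDB updates and the actions update the SCOP flat files; the dispatcher role is played by the ruleCore engine.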

22

Zhao, Huimin. "Combining schema and instance information for integrating heterogeneous databases: An analytical approach and empirical evaluation." Diss., The University of Arizona, 2002. http://hdl.handle.net/10150/280014.

Abstract:
Critical to semantic integration of heterogeneous data sources, determining the semantic correspondences among the data sources is a very complex and resource-consuming task and demands automated support. In this dissertation, we propose a comprehensive approach to detecting both schema-level and instance-level semantic correspondences from heterogeneous data sources. Semantic correspondences on the two levels are identified alternately and incrementally in an iterative procedure. Statistical cluster analysis methods and the Self-Organizing Map (SOM) neural network method are used first to identify similar schema elements (i.e., relations and attributes). Based on the identified schema-level correspondences, classification techniques drawn from statistical pattern recognition, machine learning, and artificial neural networks are then used to identify matching tuples. Multiple classifiers are combined in various ways, such as bagging, boosting, concatenating, and stacking, to improve classification accuracy. Statistical analysis techniques, such as correlation and regression, are then applied to a preliminary integrated data set to evaluate the relationships among schema elements more accurately. Improved schema-level correspondences are fed back into the identification of instance-level correspondences, resulting in a loop in the overall procedure. An empirical evaluation using real-world and simulated data is described to demonstrate the utility of the proposed multi-level, multi-technique approach to detecting semantic correspondences from heterogeneous data sources.
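One of the simplest classifier-combination schemes named in the abstract is a majority vote over base classifiers. The sketch below is illustrative only; the three toy "classifiers" deciding whether two name strings refer to the same entity are invented, not the dissertation's models.

```python
from collections import Counter

# Toy ensemble: each base classifier returns True/False for a tuple pair;
# the combiner takes the majority vote.

def majority_vote(classifiers, pair):
    votes = Counter(clf(pair) for clf in classifiers)
    return votes.most_common(1)[0][0]

# Hypothetical base classifiers judging whether two name strings co-refer.
same_prefix = lambda p: p[0][:3].lower() == p[1][:3].lower()
same_length = lambda p: len(p[0]) == len(p[1])
exact = lambda p: p[0].lower() == p[1].lower()

pair = ("Smith, J.", "smith, j.")
print(majority_vote([same_prefix, same_length, exact], pair))  # True
```

Bagging and boosting differ in how the base classifiers are trained (resampled data vs. reweighted errors), but the combination step is a vote of this kind, optionally weighted.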
23

Venkataraman, Ramesh. "Utilizing integrity constraint knowledge in heterogeneous databases: A methodology for schema integration and semantic query processing." Diss., The University of Arizona, 1995. http://hdl.handle.net/10150/187242.

Abstract:
Information sharing among databases requires the development of techniques for accessing data from multiple heterogeneous databases. One approach to providing interoperability among these databases, is to define one or more schemas representing a coherent view of the underlying databases. A review of existing research on schema integration, the process of generating integrated schemas, points to the need for development of techniques for identifying objects in multiple databases that may be related. The development of efficient mechanisms for accessing heterogeneous databases is another issue that has received very little attention in the literature. This dissertation describes a seven step methodology for utilizing integrity constraint knowledge from multiple heterogeneous databases. The methodology extends traditional approaches to schema integration by proposing additional steps that describe how integrity constraints can be used, in a heterogeneous database environment, to improve the interschema relationship identification process and generate additional semantics, in the form of integrity constraints, at the integrated schema level. The dissertation introduces the concept of constraint-based relationships among objects in heterogeneous databases and describes how these relationships can be used to integrate integrity constraints specified on heterogeneous databases. The dissertation also elaborates on how these integrated integrity constraints can be used to facilitate semantic query processing in a heterogeneous database environment. The description of a system that implements the various phases of the methodology is also presented. A unique feature of the system is that it uses blackboard architectures to facilitate the human-computer interaction needed during schema integration. 
A simulation study that shows the potential benefits of performing semantic query processing, in a heterogeneous database environment, using the integrated integrity constraints generated by our methodology is also presented.
24

Hobro, Mark. "Semantic Integration across Heterogeneous Databases : Finding Data Correspondences using Agglomerative Hierarchical Clustering and Artificial Neural Networks." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-226657.

Abstract:
The process of data integration is an important part of the database field when it comes to database migrations and the merging of data. Research in the area has grown with the addition of machine learning approaches in the last 20 years. Due to the complexity of the research field, no go-to solutions have appeared. Instead, a wide variety of ways of enhancing database migrations have emerged. This thesis examines how well a learning-based solution performs for the semantic integration problem in database migrations. Two algorithms are implemented. The first is based on information retrieval theory, with the goal of yielding a matching result that can be used as a benchmark for measuring the performance of the machine learning algorithm. The machine learning approach is based on grouping data with agglomerative hierarchical clustering and then training a neural network to recognize patterns in the data. This makes it possible to predict potential data correspondences across two databases. The results show that agglomerative hierarchical clustering performs well in the task of grouping the data into classes, which can in turn be used for training a neural network. The matching algorithm gives a high recall of matching tables, but improvements are needed to achieve both high recall and high precision. The conclusion is that the proposed learning-based approach, using agglomerative hierarchical clustering and a neural network, works as a solid base for semi-automating the data integration problem seen in this thesis, but the solution needs to be enhanced with scenario-specific algorithms and rules to reach the desired performance.
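The first stage of the approach, agglomerative hierarchical clustering, starts with every item in its own cluster and repeatedly merges the closest pair. A self-contained single-linkage sketch on 1-D points (the feature values and stopping threshold are invented for illustration):

```python
# Illustrative single-linkage agglomerative clustering: merge the two closest
# clusters until the nearest pair is farther apart than `stop_distance`.

def single_linkage(points, stop_distance):
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        # find the closest pair of clusters (single linkage: min point distance)
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > stop_distance:          # clusters are now well separated: stop
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

print(single_linkage([1.0, 1.2, 5.0, 5.3, 9.0], stop_distance=1.0))
```

In the thesis the clustered groups then serve as training classes for the neural network; other linkage criteria (complete, average) only change how the inter-cluster distance `d` is computed.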
APA, Harvard, Vancouver, ISO, and other styles
25

Benssam, Ali. "Digital Cockpits and Decision Support Systems. Design of Technics and Tools to Extract and Process Data from Heterogeneous Databases." Thesis, Université Laval, 2006. http://www.theses.ulaval.ca/2006/23599/23599.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Zuo, Landong. "A semantic and agent-based approach to support information retrieval, interoperability and multi-lateral viewpoints for heterogeneous environmental databases." Thesis, Queen Mary, University of London, 2006. http://qmro.qmul.ac.uk/xmlui/handle/123456789/1770.

Full text
Abstract:
Data stored in individual autonomous databases often needs to be combined and interrelated. For example, in the Inland Water (IW) environment monitoring domain, the spatial and temporal variation of measurements of different water quality indicators stored in different databases are of interest. Data from multiple data sources is more complex to combine when there is a lack of metadata in a computational form and when the syntax and semantics of the stored data models are heterogeneous. The main types of information retrieval (IR) requirements are query transparency and data harmonisation for data interoperability and support for multiple user views. A combined Semantic Web based and Agent based distributed system framework has been developed to support the above IR requirements. It has been implemented using the Jena ontology and JADE agent toolkits. The semantic part supports the interoperability of autonomous data sources by merging their intensional data, using a Global-As-View or GAV approach, into a global semantic model, represented in DAML+OIL and in OWL. This is used to mediate between different local database views. The agent part provides the semantic services to import, align and parse semantic metadata instances, to support data mediation and to reason about data mappings during alignment. The framework has been applied to support information retrieval, interoperability and multi-lateral viewpoints for four European environmental agency databases. An extended GAV approach has been developed and applied to handle queries that can be reformulated over multiple user views of the stored data. This allows users to retrieve data in a conceptualisation that is better suited to them rather than having to understand the entire detailed global view conceptualisation. User viewpoints are derived from the global ontology or existing viewpoints of it.
This has the advantage that it reduces the number of potential conceptualisations and their associated mappings to something more computationally manageable. Whereas an ad hoc framework based upon a conventional distributed programming language and a rule framework could be used to support user views and adaptation to user views, a more formal framework has the benefit that it can support reasoning about consistency, equivalence, containment and conflict resolution when traversing data models. A preliminary formulation of the formal model has been undertaken and is based upon extending a Datalog-type algebra with hierarchical, attribute and instance value operators. These operators can be applied to support compositional mapping and consistency checking of data views. The multiple viewpoint system was implemented as a Java-based application consisting of two sub-systems, one for viewpoint adaptation and management, the other for query processing and query result adjustment.
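The Global-As-View idea described in this abstract can be illustrated with a toy sketch: each global relation is defined as a view over local sources, and a query against the global schema is answered by unfolding the view definition. The source names, column names and data below are all invented for the example:

```python
# Toy GAV mediation sketch: two hypothetical agency databases with
# different column names are unified under one global relation
# measurement(site, indicator, value).

sources = {
    "agency_a": [{"site": "s1", "indicator": "nitrate", "value": 4.2}],
    "agency_b": [{"station": "s2", "param": "nitrate", "reading": 3.1}],
}

def measurement():
    """GAV mapping: the global relation is a view over both local sources."""
    for row in sources["agency_a"]:
        yield {"site": row["site"], "indicator": row["indicator"], "value": row["value"]}
    for row in sources["agency_b"]:
        yield {"site": row["station"], "indicator": row["param"], "value": row["reading"]}

# A global query is answered by unfolding the view over the sources.
nitrate = [m for m in measurement() if m["indicator"] == "nitrate"]
print(len(nitrate))  # → 2
```

The query over the global schema never sees the heterogeneous local column names; the mapping absorbs them, which is the transparency property the abstract describes.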
APA, Harvard, Vancouver, ISO, and other styles
27

Vilsmaier, Christian. "Contextualized access to distributed and heterogeneous multimedia data sources." Thesis, Lyon, INSA, 2014. http://www.theses.fr/2014ISAL0094/document.

Full text
Abstract:
Making multimedia data available online becomes cheaper and more convenient on a daily basis, for example by the users themselves. Web phenomena such as Facebook, Twitter and Flickr benefit from this development. These phenomena and their increased acceptance lead to a multiplication of the number of images available online. The cumulative size of these often public, and therefore searchable, images is on the order of several zettabytes. Executing a similarity query over such volumes is a challenge that the scientific community is beginning to address. One approach envisioned to deal with this problem proposes to use a distributed, heterogeneous system of content-based image retrieval systems (CBIRs). Numerous problems emerge from such a scenario. One example is the use of distinct metadata formats to describe the content of the images; another example is unequal technical and structural information. The individual metrics used by the CBIRs to compute the similarity between images are yet another example. Computing good results in this context thus proves to be a very laborious task that is not yet scientifically solved. The main problem addressed in this thesis is the search, across CBIRs, for photos similar to a given image in response to a distributed multimedia query. The main contribution of this thesis is the construction of a network of CBIRs that is aware of the semantics of the content (CBIRn). This semantic CBIRn is able to collect and merge results coming from specialized external sources. In order to be able to integrate such external sources, ready to join the network but not to disclose their configuration, an algorithm was developed that is able to estimate the configuration of a CBIRS. 
By classifying the CBIRs and analyzing incoming queries, image queries are forwarded exclusively to the most appropriate CBIRs. In this way, images of no interest to the user can be omitted in advance. The returned images are those considered similar with respect to the given query image. The feasibility of the approach and the improvement obtained in the search process are demonstrated by a prototype implementation and its evaluation using images from ImageNet. The number of relevant images returned by the approach of this thesis in response to an image query is greater by a factor of 4.75 compared to the result obtained by a predefined network of CBIRs.
Making multimedia data available online becomes less expensive and more convenient on a daily basis. This development promotes web phenomena such as Facebook, Twitter and Flickr. These phenomena and their increased acceptance in society in turn lead to a multiplication of the amount of images available online. This vast amount of frequently public, and therefore searchable, images already exceeds the zettabyte bound. Executing a similarity search over the magnitude of images that are publicly available and receiving a top quality result is a challenge that the scientific community has recently attempted to rise to. One approach to cope with this problem assumes the use of distributed heterogeneous Content Based Image Retrieval systems (CBIRs). Following from this anticipation, the problems that emerge from a distributed query scenario must be dealt with, for example the involved CBIRs' usage of distinct metadata formats for describing their content, as well as their unequal technical and structural information. An additional issue is the individual metrics that are used by the CBIRs to calculate the similarity between pictures, as well as their specific way of being combined. Overall, receiving good results in this environment is a very labor-intensive task which has been scientifically but not yet comprehensively explored. The problem primarily addressed in this work is the collection of pictures from CBIRs that are similar to a given picture, as a response to a distributed multimedia query. The main contribution of this thesis is the construction of a network of Content Based Image Retrieval systems that are able to extract and exploit the information about an input image's semantic concept. This so-called semantic CBIRn is mainly composed of CBIRs that are configured by the semantic CBIRn itself. Complementarily, there is a possibility that allows the integration of specialized external sources. 
The semantic CBIRn is able to collect and merge results of all of these attached CBIRs. In order to be able to integrate external sources that are willing to join the network, but are not willing to disclose their configuration, an algorithm was developed that approximates these configurations. By categorizing existing as well as external CBIRs and analyzing incoming queries, image queries are exclusively forwarded to the most suitable CBIRs. In this way, images that are not of any use for the user can be omitted beforehand. The returned images are then rendered comparable in order to merge them into one single result list of images that are similar to the input image. The feasibility of the approach and the improvement of the search process obtained with it are demonstrated by a prototypical implementation. Using this prototypical implementation, the number of returned images that are of the same semantic concept as the input image is increased by a factor of 4.75 with respect to a predefined non-semantic CBIRn.
APA, Harvard, Vancouver, ISO, and other styles
28

Hina, David Rex. "Evaluation of the Extensible Markup Language (XML) as a means for establishing interoperability between heterogeneous Department of Defense (DoD) databases." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2000. http://handle.dtic.mil/100.2/ADA384640.

Full text
Abstract:
Thesis (M.S. in Software Engineering) Naval Postgraduate School, Sept. 2000.
"September 2000." Thesis advisor(s): Berzins, Valdis. Includes bibliographical references (p. 103-105). Also available online.
APA, Harvard, Vancouver, ISO, and other styles
29

Riaz, Muhammad Atif, and Sameer Munir. "An Instance based Approach to Find the Types of Correspondence between the Attributes of Heterogeneous Datasets." Thesis, Blekinge Tekniska Högskola, Sektionen för datavetenskap och kommunikation, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-1938.

Full text
Abstract:
Context: Determining attribute correspondence is the most important, time-consuming and knowledge-intensive part of database integration. It is also used in other data manipulation applications such as data warehousing, data design, the semantic web and e-commerce. Objectives: In this thesis the aim is to investigate how to find the types of correspondence between the attributes of heterogeneous datasets when schema design information of the datasets is unknown. Methods: A literature review was conducted to extract the knowledge related to the approaches that are used to find the correspondence between the attributes of heterogeneous datasets. The knowledge extracted from the literature review is used in developing an instance-based approach for finding types of correspondence between the attributes of heterogeneous datasets when schema design information is unknown. To validate the proposed approach, an experiment was conducted in a real environment using data provided by the telecom industry (Ericsson), Karlskrona. Evaluation of the results was carried out using the well-known and widely used measures from the information retrieval field: precision, recall and F-measure. Results: To find the types of correspondence between the attributes of heterogeneous datasets, good results depend on the ability of the algorithm to avoid the unmatched pairs of rows during the Row Similarity Phase. An evaluation of the proposed approach was performed via experiments. We found a 96.7% F-measure (average of three experiments). Conclusions: The analysis showed that the proposed approach was feasible to use and provided users a means to find the corresponding attributes and the types of correspondence between corresponding attributes, based on the information extracted from the similar pairs of rows from the heterogeneous datasets, where their similarity is based on the same common primary key values.
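The evaluation described in this abstract relies on the standard information-retrieval measures precision, recall and F-measure. A minimal sketch of how they are computed over matched attribute pairs (the attribute pairs below are invented for the example):

```python
# Precision, recall and F-measure over sets of (attribute, attribute)
# correspondence pairs: true_pairs is the gold standard, found_pairs
# the algorithm's output.

def f_measure(true_pairs, found_pairs):
    tp = len(true_pairs & found_pairs)
    precision = tp / len(found_pairs) if found_pairs else 0.0
    recall = tp / len(true_pairs) if true_pairs else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Hypothetical gold-standard and predicted attribute correspondences.
truth = {("cust_id", "customer_id"), ("fname", "first_name"), ("dob", "birth_date")}
found = {("cust_id", "customer_id"), ("fname", "first_name"), ("dob", "phone")}
p, r, f = f_measure(truth, found)
print(round(p, 2), round(r, 2), round(f, 2))  # → 0.67 0.67 0.67
```

Two of three predictions are correct and two of three gold pairs are found, so precision, recall and F-measure all come out to 2/3 here.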
APA, Harvard, Vancouver, ISO, and other styles
30

Xie, Wanxia. "Supporting Distributed Transaction Processing Over Mobile and Heterogeneous Platforms." Diss., Georgia Institute of Technology, 2005. http://hdl.handle.net/1853/14073.

Full text
Abstract:
Recent advances in pervasive computing and peer-to-peer computing have opened up vast opportunities for developing collaborative applications. To benefit from these emerging technologies, there is a need for investigating techniques and tools that will allow development and deployment of these applications on mobile and heterogeneous platforms. To meet these challenging tasks, we need to address the typical characteristics of mobile peer-to-peer systems such as frequent disconnections, frequent network partitions, and peer heterogeneity. This research focuses on developing the necessary models, techniques and algorithms that will enable us to build and deploy collaborative applications in the Internet enabled, mobile peer-to-peer environments. This dissertation proposes a multi-state transaction model and develops a quality aware transaction processing framework to incorporate quality of service with transaction processing. It proposes adaptive ACID properties and develops a quality specification language to associate a quality level with transactions. In addition, this research develops a probabilistic concurrency control mechanism and a group based transaction commit protocol for mobile peer-to-peer systems that greatly reduces blockings in transactions and improves the transaction commit ratio. To the best of our knowledge, this is the first attempt to systematically support disconnection-tolerant and partition-tolerant transaction processing. This dissertation also develops a scalable directory service called PeerDS to support the above framework. It addresses the scalability and dynamism of the directory service from two aspects: peer-to-peer and push-pull hybrid interfaces. It also addresses peer heterogeneity and develops a new technique for load balancing in the peer-to-peer system. 
This technique comprises an improved routing algorithm for virtualized P2P overlay networks and a generalized Top-K server selection algorithm for load balancing, which could be optimized based on multiple factors such as proximity and cost. The proposed push-pull hybrid interfaces greatly reduce the overhead of directory servers caused by frequent queries from directory clients. In order to further improve the scalability of the push interface, this dissertation also studies and evaluates different filter indexing schemes through which the interests of each update could be calculated very efficiently. This dissertation was developed in conjunction with the middleware called System on Mobile Devices (SyD).
APA, Harvard, Vancouver, ISO, and other styles
31

Long, J. A. "Implementing a heterogeneous relational database node." Thesis, University of Bristol, 1985. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.355094.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Ericsson, Joakim. "Object Migration in a Distributed, Heterogeneous SQL Database Network." Thesis, Linköpings universitet, Databas och informationsteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-148181.

Full text
Abstract:
There are many different database management systems (DBMSs) on the market today. They all have different strengths and weaknesses. What if all of these different DBMSs could be used together in a heterogeneous network? The purpose of this thesis is to explore ways of connecting the many different DBMSs together. This thesis will explore suitable architectures, features, and performance of such a network. This is all done in the context of Ericsson's wireless communication network. This has not been done in this context before, and a big part of the thesis is exploring if it is even possible. The result of this thesis shows that it is not possible to find a solution that can fulfill the requirements of such a network in this context.
APA, Harvard, Vancouver, ISO, and other styles
33

Qiu, Xiangbin. "A Publish-Subscribe System for Data Replication and Synchronization Among Integrated Person-Centric Information Systems." DigitalCommons@USU, 2010. https://digitalcommons.usu.edu/etd/620.

Full text
Abstract:
Synchronization of data across an integrated system of heterogeneous databases is a difficult but important task, especially in the context of integrating health care information throughout a region, state, or nation. This thesis describes the design and implementation of a data replication and synchronization tool, called the Sync Engine, which allows users to define custom data-sharing patterns and transformations for an integrated system of heterogeneous person-centric databases. This thesis also discusses the relationship between the Sync Engine's contributions and several relevant issues in the area of data integration and replication. The Sync Engine's design and implementation was validated by adapting it to CHARM, a real world integrated system currently in use at the Utah Department of Health.
APA, Harvard, Vancouver, ISO, and other styles
34

Ismail, Hanafy Mahmoud. "An object-oriented query processing subsystem in a heterogeneous distributed database environment." Thesis, University of Kent, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.236763.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Graniela, Benito. "Harmony an architecture for network centric heterogeneous terrain database re-generation." Doctoral diss., University of Central Florida, 2011. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/4768.

Full text
Abstract:
This research investigated an alternative modeling and simulation terrain database generation paradigm that rapidly harmonizes changes to target formats throughout a distributed simulation system while accommodating bandwidth and processing time limitations. This dissertation proposes a "distributed partial bi-directional terrain database re-generation" paradigm, which envisions network based terrain database updates between reliable partners. The approach is very attractive as it reduces the amount of processing and bandwidth required to distribute locally emergent changes throughout a distributed system by only updating the affected target format data elements. In the prototype theoretical architecture that implements the approach, agent theory and ontologies are used to interpret data changes in external target formats and implement the necessary transformations on a server internal terrain database generation system. These changes are then distributed to clients to achieve consistency between all correlated representations. Experimental findings with the prototype suggest smaller network utilization and processing times than conventional terrain database generation will experience, while maintaining correlated heterogeneous terrain database representations over time. This Bi-Directional Ontology-driven TDB Re-Generation Architecture has the potential to revolutionize the traditional terrain database generation pipeline paradigm.
ID: 031001424; System requirements: World Wide Web browser and PDF reader.; Mode of access: World Wide Web.; Title from PDF title page (viewed June 19, 2013).; Thesis (Ph.D.)--University of Central Florida, 2011.; Includes bibliographical references (p. 503-508).
Ph.D.
Doctorate
Industrial Engineering and Management Systems
Engineering and Computer Science
Modeling and Simulation
APA, Harvard, Vancouver, ISO, and other styles
36

Rheinheimer, Letícia Rafaela. "WSAgent: um agente baseado em Web Services para promover a interoperabilidade entre sistemas heterogêneos no domínio da saúde." Universidade do Vale do Rio do Sinos, 2004. http://www.repositorio.jesuita.org.br/handle/UNISINOS/2205.

Full text
Abstract:
After the advent of the Internet, several software development strategies were modified to promote greater reuse and interoperability. Design Patterns and Frameworks help us create flexible software and designs. The idea of composing applications so that they work together is very attractive. In the health domain, however, several obstacles arise to accomplishing this integration. The use of agent technologies together with Web Services allows us to think of a solution that guarantees interoperability, reuse and flexibility between heterogeneous environments. This work describes the architecture of a software agent, called WSAgent (an instance of a Framelet for the patient subdomain within the health domain), and its strategies for collaboration and interoperability. This work also presents a case study with the implementation of a prototype.
After the advent of the Internet, several software development strategies were changed to promote more reuse and interoperability. Design Patterns and Frameworks help us create flexible software and designs. The idea of gluing applications together so that they cooperate is very attractive. In the health domain, there are many obstacles to achieving this goal. The use of agent technologies combined with Web Services allows us to think about constructing a binding that grants interoperability, reuse and flexibility between heterogeneous environments. This work describes the architecture of a software agent called WSAgent, an instance of a Framelet of the patient subdomain in the health domain, and its strategies for collaboration and interoperability. This work also presents a case study with the implementation of a prototype.
APA, Harvard, Vancouver, ISO, and other styles
37

Ganesan, Shankaranarayanan. "Dynamic schema evolution in a heterogeneous database environment: A graph theoretic approach." Diss., The University of Arizona, 1998. http://hdl.handle.net/10150/282767.

Full text
Abstract:
The objective of this dissertation is to create a theoretical framework and mechanisms for automating dynamic schema evolution in a heterogeneous database environment. The structure or schema of databases changes over time. Accommodating changes to the schema without loss of existing data and without significantly affecting the day-to-day operation of the database is the management of dynamic schema evolution. To address the problem of schema evolution in a heterogeneous database environment, we first propose a comprehensive taxonomy of schema changes and examine their implications. We then propose a formal methodology for managing schema evolution using graph theory with a well-defined set of operators and graph-based algorithms for tracking and propagating schema changes. We show that these operators and algorithms preserve the consistency and correctness of the schema following the changes. The complete framework is embedded in a prototype software system called SEMAD (Schema Evolution Management ADvisor). We evaluate the system for its usefulness by conducting exploratory case studies using two different heterogeneous database domains, viz., a university database environment and a scientific database environment that is used by atmospheric scientists and hydrologists. The results of the exploratory case studies supported the hypothesis that SEMAD does help database administrators in their tasks. The results indicate that SEMAD helps the administrators identify and incorporate changes better than performing these tasks manually. An important overhead cost in SEMAD is the creation of the semantic data model, capturing the metadata associated with the model, and defining the mapping information that relates the model and the set of underlying databases. This task is a one-time effort that is performed at the beginning. The subsequent changes are incrementally captured by SEMAD. 
However, the benefits of using SEMAD in dynamically managing schema evolution appear to offset this overhead cost.
APA, Harvard, Vancouver, ISO, and other styles
38

Bowman, Kelly Eric. "Neutral Parametric Database, Server, Logic Layers, and Clients to Facilitate Multi-EngineerSynchronous Heterogeneous CAD." BYU ScholarsArchive, 2016. https://scholarsarchive.byu.edu/etd/5656.

Full text
Abstract:
Engineering companies are sociotechnical systems in which engineers, designers, analysts, etc. use an array of software tools to follow prescribed product-development processes. The purpose of these amalgamated systems is to develop new products as quickly as possible while maintaining quality and meeting customer and market demands. Researchers at Brigham Young University have shortened engineering design cycle times through the development and use of multi-engineer synchronous (MES) CAD tools. Other research teams have shortened design cycle times by extending seamless interoperability across heterogeneous design tools and domains. Seamless multi-engineer synchronous heterogeneous (MESH) CAD environments are the focus of this dissertation. An architecture that supports both MES collaboration and interoperability is defined, tested for robustness, and proposed as the start of a new standard for interoperability. An N-tiered architecture with four layers is used. These layers are data storage, server communication, business logic, and client. Perhaps the most critical part of the architecture is the new neutral parametric database (NPDB) standard, which can generically store associative CAD geometry from heterogeneous CAD systems. A practical application has been developed using the architecture which demonstrates design and modeling interoperability between Siemens NX, PTC's Creo, and Dassault Systemes CATIA CAD applications; interoperability between Siemens' NX and Dassault Systemes' CATIA is specifically outlined in this dissertation. The 2D point, 2D line, 2D arc, 2D circle, 2D spline, 3D point, extrude, and revolve features have been developed. Complex models have successfully been modeled and exchanged in real time across heterogeneous CAD clients and have validated this approach for MESH CAD collaboration.
APA, Harvard, Vancouver, ISO, and other styles
39

Claypool, Kajal Tilak. "Managing schema change in an heterogeneous environment." Link to electronic thesis, 2002. http://www.wpi.edu/Pubs/ETD/Available/etd-0617102-213436.

Full text
Abstract:
Thesis (Ph. D.)--Worcester Polytechnic Institute.
Keywords: Meta modeling; schema change; frameworks; integration; schema heterogeniety; schema modeling. Includes bibliographical references (p. 381-395).
APA, Harvard, Vancouver, ISO, and other styles
40

Herrmann, Kai, Hannes Voigt, and Wolfgang Lehner. "Online horizontal partitioning of heterogeneous data." De Gruyter, 2014. https://tud.qucosa.de/id/qucosa%3A72923.

Full text
Abstract:
In an increasing number of use cases, databases face the challenge of managing heterogeneous data. Heterogeneous data is characterized by a quickly evolving variety of entities without a common set of attributes. These entities do not show enough regularity to be captured in a traditional database schema. A common solution is to centralize the diverse entities in a universal table. Usually, this leads to a very sparse table. Although today's techniques allow efficient storage of sparse universal tables, query efficiency is still a problem. Queries that address only a subset of attributes have to read the whole universal table, including many irrelevant entities. A solution is to use a partitioning of the table, which allows pruning partitions of irrelevant entities before they are touched. Creating and maintaining such a partitioning manually is very laborious or even infeasible, due to the enormous complexity. Thus an autonomous solution is desirable. In this article, we define the Online Partitioning Problem for heterogeneous data. We sketch how an optimal solution for this problem can be determined based on hypergraph partitioning. Although it leads to the optimal partitioning, the hypergraph approach is inappropriate for an implementation in a database system. We present Cinderella, an autonomous online algorithm for horizontal partitioning of heterogeneous entities in universal tables. Cinderella is designed to keep its overhead low by operating online; it incrementally assigns entities to partitions while they are touched anyway during modifications. This enables a reasonable physical database design at runtime instead of static modeling.
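The online-assignment idea described in this abstract can be sketched with a toy rule: each incoming entity (a set of attributes) is placed in the horizontal partition whose attribute set it overlaps most, and a new partition is opened when there is no overlap. The greedy scoring rule and the attribute names below are simplifications invented for illustration, not the actual Cinderella algorithm:

```python
# Toy online horizontal partitioning: entities are attribute sets;
# each is assigned to the partition with maximal attribute overlap,
# so queries over a few attributes can prune unrelated partitions.

def assign(partitions, entity_attrs, threshold=1):
    """Pick the partition with maximal attribute overlap, else open a new one."""
    best, best_score = None, 0
    for i, attrs in enumerate(partitions):
        score = len(attrs & entity_attrs)
        if score > best_score:
            best, best_score = i, score
    if best is None or best_score < threshold:
        partitions.append(set(entity_attrs))
        return len(partitions) - 1
    partitions[best] |= entity_attrs
    return best

parts = []
print(assign(parts, {"title", "author"}))     # → 0 (new partition)
print(assign(parts, {"title", "isbn"}))       # → 0 (overlaps on "title")
print(assign(parts, {"voltage", "current"}))  # → 1 (new partition)
```

Because assignment happens as entities are inserted or modified, the partitioning emerges incrementally at runtime, which is the low-overhead property the abstract emphasizes.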
APA, Harvard, Vancouver, ISO, and other styles
41

Li, Jianxin. "Adaptive query relaxation and processing over heterogeneous xml data sources." Swinburne Research Bank, 2009. http://hdl.handle.net/1959.3/66874.

Full text
Abstract:
Thesis (Ph.D) - Swinburne University of Technology, Faculty of Information & Communication Technologies, 2009.
A dissertation submitted to the Faculty of Information and Communication Technologies, Swinburne University of Technology in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2009. Typescript. "August 2009". Bibliography p. 161-171.
APA, Harvard, Vancouver, ISO, and other styles
42

Soussi, Rania. "Querying and extracting heterogeneous graphs from structured data and unstrutured content." Phd thesis, Ecole Centrale Paris, 2012. http://tel.archives-ouvertes.fr/tel-00740663.

Full text
Abstract:
The present work introduces a set of solutions to extract graphs from enterprise data and facilitate the process of information search on these graphs. First of all we have defined a new graph model called the SPIDER-Graph, which models complex objects and permits to define heterogeneous graphs. Furthermore, we have developed a set of algorithms to extract the content of a database from an enterprise and to represent it in this new model. This latter representation allows us to discover relations that exist in the data but are hidden due to their poor compatibility with the classical relational model. Moreover, in order to unify the representation of all the data of the enterprise, we have developed a second approach which extracts from unstructured data an enterprise's ontology containing the most important concepts and relations that can be found in a given enterprise. Having extracted the graphs from the relational databases and documents using the enterprise ontology, we propose an approach which allows the users to extract an interaction graph between a set of chosen enterprise objects. This approach is based on a set of relations patterns extracted from the graph and the enterprise ontology concepts and relations. Finally, information retrieval is facilitated using a new visual graph query language called GraphVQL, which allows users to query graphs by drawing a pattern visually for the query. This language covers different query types from the simple selection and aggregation queries to social network analysis queries.
APA, Harvard, Vancouver, ISO, and other styles
43

Park, Jinsoo. "Facilitating interoperability among heterogeneous geographic database systems: A theoretical framework, a prototype system, and evaluation." Diss., The University of Arizona, 1999. http://hdl.handle.net/10150/284627.

Full text
Abstract:
The objective of this research is to develop a formal semantic model, theoretical framework and methodology to facilitate interoperability among distributed and heterogeneous geographic database systems (GDSs). The primary research question is how to identify and resolve various data- and schematic-level conflicts among such information sources. Set theory is used to formalize the semantic model, which supports explicit modeling of the complex nature of geographic data objects. The semantic model is used as a canonical model for conceptual schema design and integration. The intension (including structure, integrity rules and meta-properties) of the database schema is captured in the semantic model. A comprehensive framework classifying various semantic conflicts is proposed. This framework is then used as a basis for automating the detection and resolution of semantic conflicts among heterogeneous databases. A methodology for conflict detection and resolution is proposed to develop interoperable system environment. The methodology is based on the concept of a "mediator." Several types of semantic mediators are defined and developed to achieve interoperability. An ontology is developed to capture various semantic conflicts. The metadata and ontology are stored in a common repository and manipulated by description logic-based operators. A query processing technique is developed to provide uniform and integrated access to the multiple heterogeneous databases. Logic is employed to formalize our methodology, which provides a unified view of the underlying representational and reasoning formalism for the semantic mediation process. A usable prototype system is implemented to provide proof of the concept underlying this work. The system has been integrated with the Internet and can be accessed through any Java-enabled web browser. 
Finally, the usefulness of our methodology and the system is evaluated using three different cases that represent different application domains. Various heterogeneous geospatial datasets and non-geographic datasets are used during the evaluation phase. The results of the evaluation suggest that correct identification and construction of both schema and ontology-schema mapping knowledge play very important roles in achieving interoperability at both the data and schema levels. The research adopts a multi-methodological approach that incorporates set theory, logic, prototyping, and case study.
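As a hedged illustration of the mediator idea described above, the sketch below resolves naming and unit conflicts between two hypothetical geographic sources through a shared mapping table; every schema name, attribute, and conversion here is an invented assumption, not taken from the dissertation.

```python
# Hypothetical attribute-level semantic mediation: each (source, attribute)
# pair maps to a canonical attribute plus a conversion resolving unit or
# naming conflicts. All names and conversions are illustrative.
ONTOLOGY = {
    ("gds_a", "elev_ft"):   ("elevation_m", lambda v: round(v * 0.3048, 2)),
    ("gds_b", "elevation"): ("elevation_m", float),
    ("gds_a", "name"):      ("feature_name", str),
    ("gds_b", "label"):     ("feature_name", str),
}

def mediate(source, record):
    """Translate one source record into the canonical (mediated) schema."""
    out = {}
    for attr, value in record.items():
        canon, convert = ONTOLOGY[(source, attr)]
        out[canon] = convert(value)
    return out

# Two records describing the same feature in conflicting schemas.
a = mediate("gds_a", {"name": "Mt. Lemmon", "elev_ft": 9157})
b = mediate("gds_b", {"label": "Mt. Lemmon", "elevation": 2791.1})
```

After mediation both records use the same attribute names and units, so a query processor can treat them uniformly.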
APA, Harvard, Vancouver, ISO, and other styles
44

Shumway, Devin James. "Hybrid State-Transactional Database for Product Lifecycle Management Features in Multi-Engineer Synchronous Heterogeneous Computer-Aided Design." BYU ScholarsArchive, 2017. https://scholarsarchive.byu.edu/etd/6341.

Full text
Abstract:
There are many different programs that can perform Computer-Aided Design (CAD). In order for these programs to share data, file translations need to occur. These translations have typically been done with IGES and STEP files. With the work done at the BYU CAD Lab to create a multi-engineer synchronous heterogeneous CAD environment, these translation processes have become synchronous by using a server and a database to manage the data. However, this system stores part data in a database, and the data in the database cannot be used in traditional Product Lifecycle Management systems. To remedy this, a new database was developed that enables every edit made in a CAD part across multiple CAD systems to be stored as well as worked on simultaneously. This allows users to access every action performed in a part. Branching was introduced to the database, which allows users to work on multiple configurations of a part simultaneously and reduces file save sizes for different configurations by 98.6% compared to those created by traditional CAD systems.
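A minimal sketch of how such a branch-aware, append-only edit store might look; the class, method names, and operations below are illustrative assumptions, not the thesis's actual schema. Forking a branch shares history rather than copying part data, which is what makes small configuration saves possible.

```python
# Hypothetical append-only, branch-aware edit log for synchronous
# multi-user CAD; structure and names are invented for illustration.
class EditLog:
    def __init__(self):
        self.edits = []               # every edit ever made, append-only
        self.branches = {"main": []}  # branch name -> list of edit indices

    def record(self, branch, op):
        """Store an edit on a branch; no edit is ever overwritten."""
        self.edits.append(op)
        self.branches[branch].append(len(self.edits) - 1)

    def fork(self, source, new_branch):
        """A new configuration shares the source branch's history,
        so the fork itself stores no duplicate part data."""
        self.branches[new_branch] = list(self.branches[source])

    def replay(self, branch):
        """Rebuild a configuration by replaying its edits in order."""
        return [self.edits[i] for i in self.branches[branch]]

log = EditLog()
log.record("main", "extrude base")
log.fork("main", "variant-a")
log.record("variant-a", "fillet edge")
```

Replaying "variant-a" yields the shared base edit plus its own, while "main" is unaffected by work on the branch.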
APA, Harvard, Vancouver, ISO, and other styles
45

Mayr, Philipp. "Re-Ranking auf Basis von Bradfordizing für die verteilte Suche in digitalen Bibliotheken." Doctoral thesis, Humboldt-Universität zu Berlin, Philosophische Fakultät I, 2009. http://dx.doi.org/10.18452/15906.

Full text
Abstract:
In spite of huge document sets for cross-database literature searches, academic users expect a high ratio of relevant, high-quality documents in result sets. Besides direct full-text access to documents, the order and structure of the listed results (the ranking) now play a decisive role in the design of search systems. Users also expect flexible information systems that allow them to influence the ranking of documents or to apply alternative ranking techniques. This thesis proposes two value-added services for search systems that address typical problems in searching scientific literature and can measurably improve the retrieval situation. The two services, semantic treatment of heterogeneity (using cross-concordances as the example) and re-ranking based on Bradfordizing, are applied in different phases of the search; both are described in detail, and their effectiveness for typical subject-specific searches is evaluated in the empirical part of the thesis. The primary goal of the thesis is to examine whether the proposed alternative re-ranking approach, Bradfordizing, is operable in the domain of bibliographic databases and whether it can profitably be deployed in information systems and offered to users. Topics and data from two evaluation projects (CLEF and KoMoHe) were used for the tests. The intellectually assessed documents come from seven academic abstracting and indexing databases covering the social sciences, political science, economics, psychology, and medicine.
The evaluation of the cross-concordances (82 topics in total) shows that retrieval results improve significantly for all cross-concordances; it also shows that interdisciplinary cross-concordances have the strongest (positive) effect on the search results. The evaluation of Bradfordizing re-ranking (164 topics in total) shows that, for most test series, documents from the core zone (core journals) yield significantly higher precision than documents from zone 2 and zone 3 (periphery journals). This relevance advantage after Bradfordizing is demonstrated empirically for both journals and monographs, across a very broad range of topics and on two independent document corpora.
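As a rough, hedged sketch of the Bradfordizing idea, the snippet below re-ranks a result list by the productivity of each journal within the result set; the thesis's actual method partitions journals into Bradford zones, whereas this simplification orders documents by raw journal frequency.

```python
from collections import Counter

def bradfordize(docs):
    """Simplified Bradfordizing: re-rank documents by how many hits
    their journal contributes to the result set.

    docs: list of (doc_id, journal) pairs in original ranking order.
    sorted() is stable, so the original order is preserved within
    each journal (and across journals of equal productivity).
    """
    freq = Counter(journal for _, journal in docs)
    return sorted(docs, key=lambda d: -freq[d[1]])

docs = [("d1", "A"), ("d2", "B"), ("d3", "B"),
        ("d4", "A"), ("d5", "C"), ("d6", "B")]
reranked = bradfordize(docs)
```

Documents from the most productive journal ("B", three hits) move to the front, approximating a core-zone-first ordering.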
APA, Harvard, Vancouver, ISO, and other styles
46

Karnagel, Tomas. "Heterogeneity-Aware Placement Strategies for Query Optimization." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2017. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-225613.

Full text
Abstract:
Computing hardware is changing from systems with homogeneous CPUs to systems with heterogeneous computing units like GPUs, Many Integrated Cores, or FPGAs. This trend is caused by scaling problems of homogeneous systems, where heat dissipation and energy consumption limit further growth in compute performance. Heterogeneous systems provide differently optimized computing hardware, which allows different operations to be computed on the most appropriate computing unit, resulting in faster execution and less energy consumption. For database systems, this is a new opportunity to accelerate query processing, allowing faster and more interactive querying of large amounts of data. However, the current hardware trend is also a challenge, as most database systems do not support heterogeneous computing resources and it is not clear how to support these systems best. In the past, mainly single operators were ported to different computing units, showing great results but lacking a system-wide application. To efficiently support heterogeneous systems, a systems approach to query processing and query optimization is needed. In this thesis, we tackle the optimization challenge in detail. As a starting point, we evaluate three different approaches on isolated use-cases to assess their advantages and limitations. First, we evaluate a fork-join approach of intra-operator parallelism, where the same operator is executed on multiple computing units at the same time, each execution with different data partitions. Second, we evaluate using one computing unit statically to accelerate one operator, which provides high code-optimization potential due to the static, pre-known usage of hardware and software. Third, we evaluate dynamically placing operators onto computing units, depending on the operator, the available computing hardware, and the given data sizes. We argue that the first and second approach suffer from multiple overheads or high implementation costs.
The third approach, dynamic placement, shows good performance, while being highly extensible to different computing units and different operator implementations. To automate this dynamic approach, we first propose general placement optimization for query processing. This general approach includes runtime estimation of operators on different computing units as well as two approaches for defining the actual operator placement according to the estimated runtimes. The two placement approaches are local optimization, which decides the placement locally at run-time, and global optimization, where the placement is decided at compile-time, while allowing a global view for enhanced data sharing. The main limitation of the latter is the high dependency on cardinality estimation of intermediate results, as estimation errors for the cardinalities propagate to the operator runtime estimation and placement optimization. Therefore, we propose adaptive placement optimization, allowing the placement optimization to become fully independent of cardinality estimation, effectively eliminating the main source of inaccuracy for runtime estimation and placement optimization. Finally, we define an adaptive placement sequence, incorporating all our proposed techniques of placement optimization. We implement this sequence as a virtualization layer between the database system and the heterogeneous hardware. Our implementation builds on pre-existing interfaces to the database system and the hardware, allowing non-intrusive integration into existing database systems. We evaluate our techniques using two different database systems and two different OLAP benchmarks, accelerating query processing through heterogeneous execution.
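A hedged sketch of the local placement variant described above: each operator is assigned to the computing unit with the lowest estimated runtime. Operator names, unit names, and the estimates are invented for illustration, and data-transfer costs (which the global and adaptive approaches also consider) are ignored.

```python
def place_operators(plan, runtime_est):
    """Greedy local placement: for each operator in the plan, pick the
    computing unit with the lowest estimated runtime.

    plan: list of operator names, in execution order.
    runtime_est: maps (operator, unit) -> estimated runtime in seconds.
    Transfer costs between units are deliberately ignored here.
    """
    placement = {}
    for op in plan:
        candidates = {u: t for (o, u), t in runtime_est.items() if o == op}
        placement[op] = min(candidates, key=candidates.get)
    return placement

# Illustrative estimates: the scan is cheaper on the CPU, the join
# benefits from the GPU.
est = {("scan", "cpu"): 1.0, ("scan", "gpu"): 2.5,
       ("join", "cpu"): 8.0, ("join", "gpu"): 3.0}
placement = place_operators(["scan", "join"], est)
```

Because each decision is made per operator at run-time, this variant needs no cardinality estimates for intermediate results beyond the current step.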
APA, Harvard, Vancouver, ISO, and other styles
47

Huang, Chi-Ming, and 黃啟銘. "Heterogeneous Databases Integration System In Medicine." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/18042554497386129791.

Full text
Abstract:
Master's
National Taiwan University
Institute of Biomedical Engineering
91
Enterprises and hospitals run a variety of application systems built on different kinds of database systems. This heterogeneity is an obstacle to further uses of the data, such as decision making and data analysis. To solve the problem, a database middleware system is needed that lets users integrate heterogeneous data sources conveniently. This thesis proposes a heterogeneous database integration system that performs extraction, transformation, and loading in three phases: first, a semi-automatic integration tool lets users choose the tables to be joined and align the relationships between them; the system then transfers the related tables and data into database containers according to those relationships; finally, it joins the data from the heterogeneous database tables.
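The three-phase extract-transfer-join flow described above might be sketched as follows, with two in-memory SQLite databases standing in for the heterogeneous sources and a third acting as the container; all table and column names are illustrative assumptions.

```python
import sqlite3

# Two in-memory SQLite databases stand in for heterogeneous sources.
his = sqlite3.connect(":memory:")
his.execute("CREATE TABLE patients (pid INTEGER, name TEXT)")
his.execute("INSERT INTO patients VALUES (1, 'Lin'), (2, 'Chen')")

lab = sqlite3.connect(":memory:")
lab.execute("CREATE TABLE results (pid INTEGER, test TEXT, value REAL)")
lab.execute("INSERT INTO results VALUES (1, 'glucose', 95.0)")

# Extract from each source and load into a common container database.
container = sqlite3.connect(":memory:")
container.execute("CREATE TABLE patients (pid INTEGER, name TEXT)")
container.execute("CREATE TABLE results (pid INTEGER, test TEXT, value REAL)")
container.executemany("INSERT INTO patients VALUES (?, ?)",
                      his.execute("SELECT * FROM patients"))
container.executemany("INSERT INTO results VALUES (?, ?, ?)",
                      lab.execute("SELECT * FROM results"))

# Finally, join data that originated in different database systems.
rows = container.execute(
    "SELECT p.name, r.test, r.value FROM patients p "
    "JOIN results r ON p.pid = r.pid").fetchall()
```

Once both sources sit in the container, the join is an ordinary single-database query, which is the point of staging the data first.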
APA, Harvard, Vancouver, ISO, and other styles
48

"A Methodology for integration of heterogeneous databases." Productivity From Information Technology, "PROFIT" Research Initiative, Sloan School of Management, Massachusetts Institute of Technology, 1994. http://hdl.handle.net/1721.1/2556.

Full text
Abstract:
M.P. Reddy ... [et al.]
Reprinted from IEEE Transactions on Knowledge and Data Engineering, vol. 6, no. 6 (December 1994).
Includes bibliographical references (p. 932).
Supported by the Productivity From Information Technology (PROFIT) Research Initiative at MIT.
APA, Harvard, Vancouver, ISO, and other styles
49

Luo, Tzong-Shyan, and 羅宗賢. "Solving Domain Mismatch Problems in Heterogeneous Databases." Thesis, 1994. http://ndltd.ncl.edu.tw/handle/31316473295371273625.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Wang, Wei-Cheng, and 王緯誠. "Heterogeneous Databases Integration System Base on XML." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/95253059899361602620.

Full text
Abstract:
Master's
National Taiwan University
Institute of Biomedical Engineering
93
After many years of development, database systems have become one of the most widely used tools for storing data. However, large institutions (e.g., hospitals) do not all use the same database system, and the differences between these systems make integration difficult. As data mining grows in importance, integrating heterogeneous databases becomes a key problem. Previous studies have mostly approached the integration of heterogeneous database systems from the database side: designing middleware between databases, using distributed queries to integrate information from different database systems, or using a single database as a common repository to reduce the differences between databases and ease integration. Since the introduction of XML (eXtensible Markup Language), however, data handling has had a common standard to follow, and the characteristics of XML documents help integrate structured and semi-structured data effectively, an aspect seldom discussed in earlier work. This thesis examines the available tools and methods for database integration, uses the XML standard as the basis of the system, and designs a graphical user interface that lets users extract, integrate, and query many heterogeneous database systems at the same time through simple operations, without needing to understand the differences between databases. In addition, the thesis exploits XML's characteristics to combine documents with heterogeneous database systems, so that integration is no longer limited to structured databases.
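A minimal sketch of the XML-as-common-format idea: rows from each source are wrapped in a shared XML envelope, after which a single query spans both sources. Source names, columns, and data are invented for illustration.

```python
import xml.etree.ElementTree as ET

def to_xml(source_name, rows, columns):
    """Wrap rows extracted from one database in a common XML envelope."""
    src = ET.Element("source", name=source_name)
    for row in rows:
        rec = ET.SubElement(src, "record")
        for col, val in zip(columns, row):
            ET.SubElement(rec, col).text = str(val)
    return src

# Rows from two hypothetical heterogeneous databases, now in one tree.
root = ET.Element("integrated")
root.append(to_xml("oracle_his", [(1, "Lin")], ["pid", "name"]))
root.append(to_xml("mysql_lab", [(1, "glucose")], ["pid", "test"]))

# One uniform traversal now covers both sources.
pids = [e.text for e in root.iter("pid")]
```

Because both structured rows and semi-structured documents serialize to the same XML form, the integrated tree can also absorb non-tabular sources.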
APA, Harvard, Vancouver, ISO, and other styles