Theses on the topic "Temporal clinical data warehouse"

Browse the top 31 theses for your research on the topic "Temporal clinical data warehouse".

1

Jamjoom, Arwa. « Transitioning a clinical unit to a data warehouse ». Thesis, University of Surrey, 2011. http://epubs.surrey.ac.uk/804656/.

2

Tsuruda, Renata Miwa. « STB-index : um índice baseado em bitmap para data warehouse espaço-temporal ». Universidade Federal de São Carlos, 2012. https://repositorio.ufscar.br/handle/ufscar/525.

Abstract:
Financiadora de Estudos e Projetos
The growing concern with supporting the decision-making process has led companies to seek technologies that support their decisions. The technology most widely used at present is the Data Warehouse (DW), which stores data so that useful and reliable information can be produced to assist in strategic decisions. Combining the concepts of the Spatial Data Warehouse (SDW), which allows geometries to be stored and managed, and the Temporal Data Warehouse (TDW), which allows storing the data changes that occur in the real world, a research topic known as the Spatio-Temporal Data Warehouse (STDW) has emerged. STDWs are suitable for the treatment of geometries that change over time. These technologies, combined with the steady growth in data volume, show the need for index structures that improve the performance of analytical query processing with spatial predicates and with geometries that may vary over time. To this end, this work proposes an index for STDWs called the Spatio-Temporal Bitmap Index, or STB-index. The proposed index was designed to process drill-down and roll-up queries, considering the existence of predefined spatial hierarchies and spatial attributes whose position and shape can vary over time. The STB-index was validated through experimental tests using an STDW created from synthetic data. The tests evaluated the elapsed time and the number of disk accesses needed to construct the index, the storage space occupied by the index, and the elapsed time and number of disk accesses for query processing. Results were compared with query processing using database management system resources, and the STB-index improved query response time by 98.12% to 99.22% compared with materialized views.
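As a rough illustration of how a bitmap-based index can serve roll-up queries over a predefined spatial hierarchy, the Python sketch below keeps one bitmap per value of the finest level and derives the coarser level by OR-ing the bitmaps of its children. This is not the STB-index itself: the city-to-state hierarchy, the fact rows, and every function name are hypothetical, and changes of geometry over time are ignored.

```python
from collections import defaultdict

# Hypothetical fact rows: (city, revenue). The row position is the bit position.
FACTS = [("Sao Carlos", 10), ("Campinas", 20), ("Recife", 5), ("Sao Carlos", 7)]

# Hypothetical predefined spatial hierarchy: city -> state.
CITY_TO_STATE = {"Sao Carlos": "SP", "Campinas": "SP", "Recife": "PE"}


def build_city_bitmaps(facts):
    """Return {city: int}, where bit i is set if fact row i belongs to the city."""
    bitmaps = defaultdict(int)
    for row_id, (city, _revenue) in enumerate(facts):
        bitmaps[city] |= 1 << row_id
    return dict(bitmaps)


def roll_up(city_bitmaps, city_to_state):
    """Derive state-level bitmaps by OR-ing the bitmaps of each state's cities."""
    state_bitmaps = defaultdict(int)
    for city, bitmap in city_bitmaps.items():
        state_bitmaps[city_to_state[city]] |= bitmap
    return dict(state_bitmaps)


def total_revenue(bitmap, facts):
    """Sum the measure over the fact rows whose bit is set in the bitmap."""
    return sum(
        revenue
        for row_id, (_city, revenue) in enumerate(facts)
        if (bitmap >> row_id) & 1
    )


city_bitmaps = build_city_bitmaps(FACTS)
state_bitmaps = roll_up(city_bitmaps, CITY_TO_STATE)
print(total_revenue(state_bitmaps["SP"], FACTS))
```

A drill-down query goes the other way: it starts from a state bitmap and intersects it (bitwise AND) with the bitmap of the city of interest.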
3

Veronica, Ruiz Castro Carla. « CSTM : a conceptual spatiotemporal model for data warehouses ». Universidade Federal de Pernambuco, 2010. https://repositorio.ufpe.br/handle/123456789/2209.

Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico
Comprehensive studies on temporal and spatial data warehouses have been conducted. Temporal data warehouses make it possible to handle time-varying data in both fact tables and dimension tables. A wide variety of applications need to capture not only spatial but also temporal characteristics of the modeled entities. However, studies that unite these two research areas have not been sufficiently considered. It is in this context that the present dissertation is set. It proposes a conceptual model for spatiotemporal data warehouses. This model allows users to define levels, hierarchies, and dimensions with both spatial and temporal characteristics. As a consequence, it is possible to represent time-varying spatial attributes. In addition, this work defines a set of spatiotemporal operators that could be useful for querying spatiotemporal data warehouses. Unlike existing proposals, our operators integrate not only multidimensional and spatial operators but also spatial and temporal (i.e., spatiotemporal) ones in a single syntax. A taxonomic scheme that classifies the proposed operators is also defined. The importance of the proposed taxonomy is that it helps in the development of spatiotemporal OLAP technology. To automate the modeling of spatiotemporal schemas, a CASE tool was developed. Besides allowing the definition of schemas that conform to the proposed conceptual model, this tool also allows the automatic generation of the corresponding logical schema using an object-relational approach. The proposed ideas are validated with a case study in the meteorological domain. The study presents a practical application of the spatiotemporal conceptual model and of the spatiotemporal operators presented in this work.
4

Filannino, Michele. « Data-driven temporal information extraction with applications in general and clinical domains ». Thesis, University of Manchester, 2016. https://www.research.manchester.ac.uk/portal/en/theses/datadriven-temporal-information-extraction-with-applications-in-general-and-clinical-domains(34d7e698-f8a8-4fbf-b742-d522c4fe4a12).html.

Abstract:
The automatic extraction of temporal information from written texts is pivotal for many Natural Language Processing applications such as question answering, text summarisation and information retrieval. However, Temporal Information Extraction (TIE) is a challenging task because of the number of expression types (durations, frequencies, times, dates) and their high morphological variability and ambiguity. Among existing approaches, rule-based ones are the most common, while data-driven ones are under-explored. This thesis introduces a novel domain-independent data-driven TIE strategy. The identification strategy is based on machine learning sequence labelling classifiers on features selected through an extensive exploration. Results are further optimised using an a posteriori label-adjustment pipeline. The normalisation strategy is rule-based and builds on a pre-existing system. The methodology has been applied to both a specific (clinical) domain and the generic domain, and has been officially benchmarked at the i2b2/2012 and TempEval-3 challenges, ranking 3rd and 1st respectively. The results show the TIE task to be more challenging in the clinical domain (overall accuracy 63%) than in the general domain (overall accuracy 69%). Finally, this thesis also presents two applications of TIE. One of them introduces the concept of the temporal footprint of a Wikipedia article, and uses it to mine the life span of persons. In the other, TIE techniques are used to improve pre-existing information retrieval systems by filtering out temporally irrelevant results.
5

Mawilmada, Pubudika Kumari. « Impact of a data warehouse model for improved decision-making process in healthcare ». Thesis, Queensland University of Technology, 2011. https://eprints.qut.edu.au/47532/1/Pubudika_Mawilmada_Thesis.pdf.

Abstract:
The health system is one sector dealing with a deluge of complex data. Many healthcare organisations struggle to utilise these volumes of health data effectively and efficiently. Also, many healthcare organisations still have stand-alone systems that are not integrated for management of information and decision-making. This shows that there is a need for an effective system to capture, collate and distribute this health data. Therefore, implementing the data warehouse concept in healthcare is potentially one of the solutions to integrate health data. Data warehousing has been used to support business intelligence and decision-making in many other sectors such as the engineering, defence and retail sectors. The research problem addressed is "how can data warehousing assist the decision-making process in healthcare". To address this problem, the researcher has narrowed the investigation to focus on a cardiac surgery unit. This research used the cardiac surgery unit at the Prince Charles Hospital (TPCH) as the case study. The cardiac surgery unit at TPCH uses a stand-alone database of patient clinical data, which supports clinical audit, service management and research functions. However, much of the time, the interaction between the cardiac surgery unit information system and other units is minimal. There is a limited and basic two-way interaction with other clinical and administrative databases at TPCH which support decision-making processes. The aims of this research are to investigate what decision-making issues are faced by healthcare professionals with the current information systems and how decision-making might be improved within this healthcare setting by implementing an aligned data warehouse model or models.
As a part of the research, the researcher will propose and develop a suitable data warehouse prototype based on the cardiac surgery unit needs and integrating the Intensive Care Unit database, Clinical Costing unit database (Transition II) and Quality and Safety unit database [electronic discharge summary (e-DS)]. The goal is to improve the current decision-making processes. The main objectives of this research are to improve access to integrated clinical and financial data, providing potentially better information for decision-making for both improved from the questionnaire and by referring to the literature, the results indicate a centralised data warehouse model for the cardiac surgery unit at this stage. A centralised data warehouse model addresses current needs and can also be upgraded to an enterprise-wide warehouse model or federated data warehouse model, as discussed in many of the consulted publications. The data warehouse prototype was developed using SAS enterprise data integration studio 4.2, and the data was analysed using SAS enterprise edition 4.3. In the final stage, the data warehouse prototype was evaluated by collecting feedback from the end users. This was achieved by using output created from the data warehouse prototype as examples of the data desired and possible in a data warehouse environment. According to the feedback collected from the end users, implementation of a data warehouse was seen to be a useful tool to inform management options, provide a more complete representation of factors related to a decision scenario and potentially reduce information product development time. However, many constraints exist in this research.
Examples include technical issues such as data incompatibilities and the integration of the cardiac surgery database and e-DS database servers, as well as Queensland Health information restrictions (Queensland Health information related policies, patient data confidentiality and ethics requirements), limited availability of support from IT technical staff, and time restrictions. These factors have influenced the process for the warehouse model development, necessitating an incremental approach. This highlights the presence of many practical barriers to data warehousing and integration at the clinical service level. Limitations included the use of a small convenience sample of survey respondents, and a single site case report study design. As mentioned previously, the proposed data warehouse is a prototype and was developed using only four database repositories. Despite this constraint, the research demonstrates that by implementing a data warehouse at the service level, decision-making is supported and data quality issues related to access and availability can be reduced, providing many benefits. Output reports produced from the data warehouse prototype demonstrated usefulness for the improvement of decision-making in the management of clinical services, and quality and safety monitoring for better clinical care. In the future, the centralised model selected can be upgraded to an enterprise-wide architecture by integrating with additional hospital units' databases.
6

Dietrich, Georg [author], and Frank Puppe [reviewer]. « Ad Hoc Information Extraction in a Clinical Data Warehouse with Case Studies for Data Exploration and Consistency Checks ». Würzburg : Universität Würzburg, 2019. http://d-nb.info/1191102610/34.

7

Koylu, Caglar. « A Case Study In Weather Pattern Searching Using A Spatial Data Warehouse Model ». Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/2/12609573/index.pdf.

Abstract:
Data warehousing and Online Analytical Processing (OLAP) technology has been used to access, visualize and analyze multidimensional, aggregated, and summarized data. A large part of this data contains spatial components. These spatial components convey valuable information and must be included in the exploration and analysis phases of a spatial decision support system (SDSS). On the other hand, Geographic Information Systems (GISs) provide a wide range of tools to analyze spatial phenomena and therefore must be included in the analysis phases of a decision support system (DSS). In this regard, this study seeks answers to the problem of how to design a spatially enabled data warehouse architecture that supports spatio-temporal data analysis and exploration of multidimensional data. Consequently, in this study, the concepts of OLAP and GISs are synthesized in an integrated fashion to maximize the benefits generated from the strengths of both systems by building a spatial data warehouse model. In this context, a multidimensional spatio-temporal data model is proposed as a result of this synthesis. This model addresses the integration problem of spatial, non-spatial and temporal data and facilitates spatial data exploration and analysis. The model is evaluated by implementing a case study in weather pattern searching.
8

Hagen, Matthew. « Biological and clinical data integration and its applications in healthcare ». Diss., Georgia Institute of Technology, 2014. http://hdl.handle.net/1853/54267.

Abstract:
Answers to the most complex biological questions are rarely determined solely from experimental evidence. They require subsequent analysis of many data sources that are often heterogeneous. Most biological data repositories focus on providing only one particular type of data, such as sequences, molecular interactions, protein structure, or gene expression. In many cases, researchers must visit several different databases to answer one scientific question. It is essential to develop efficient and seamless strategies for integrating disparate biological data sources, to facilitate the discovery of novel associations and validate existing hypotheses. This thesis presents the design and development of different integration strategies for biological and clinical systems. The BioSPIDA system is a data warehousing solution that integrates many NCBI databases and other biological sources on protein sequences, protein domains, and biological pathways. It utilizes a universal parser, facilitating integration without developing separate source code for each data site. This enables users to execute fine-grained queries that can filter genes by their protein interactions, gene expressions, functional annotation, and protein domain representation. Relational databases can quickly return filtered results to research questions, but they are not the most suitable solution in all cases. Clinical patients and genes are typically annotated by concepts in hierarchical ontologies, and the performance of relational databases weakens considerably when traversing and representing graph structures. This thesis illustrates when relational databases are most suitable and compares the performance benchmarks of semantic web technologies and graph databases when comparing ontological concepts. Several approaches to analyzing integrated data are discussed to demonstrate the advantages over dependencies on remote data centers.
Intensive care patients are prioritized by their length of stay, and their severity class is estimated from their diagnosis, to help minimize wait time and preferentially treat patients by their condition. In a separate study, semantic clustering of patients is conducted by integrating a clinical database and a medical ontology to help identify multi-morbidity patterns. In the biological area, gene pathways, protein interaction networks, and functional annotation are integrated to help predict and prioritize candidate disease genes. This thesis presents the results generated from each project by utilizing a local repository of genes, functional annotations, protein interactions, clinical patients, and medical ontologies.
9

Lluch-Ariet, Magí. « Contributions to efficient and secure exchange of networked clinical data : the MOSAIC system ». Doctoral thesis, Universitat Politècnica de Catalunya, 2016. http://hdl.handle.net/10803/388037.

Abstract:
The understanding of certain data often requires the collection of similar data from different places to be analysed and interpreted. Multi-agent systems (MAS), interoperability standards (DICOM, HL7 or EN13606) and clinical ontologies are facilitating data interchange among different clinical centres around the world. However, as more data becomes available, and as this data becomes more heterogeneous, the task of accessing and exploiting the large number of distributed repositories to extract useful knowledge becomes increasingly complex. Beyond the existing networks and advances for data transfer, data sharing protocols to support multilateral agreements are useful to exploit the knowledge of distributed Data Warehouses. Access to a certain data set in a federated Data Warehouse may be constrained by the requirement to deliver another specific data set. When bilateral agreements between two nodes of a network are not enough to satisfy the constraints for accessing a certain data set, multilateral agreements for data exchange can be a solution. The research carried out in this PhD thesis comprises the design and implementation of a Multi-Agent System for multilateral exchange agreements of clinical data, and an evaluation of how those multilateral agreements increase the percentage of data collected by a single node from the total amount of data available in the network. Different strategies to reduce the number of messages needed to achieve an agreement are also considered. The results show that, in this collaborative sharing scenario, the percentage of data collected improves dramatically from bilateral agreements to multilateral ones, reaching almost all data available in the network.
10

Scheufele, Elisabeth Lee. « Medication recommendations vs. peer practice in pediatric levothyroxine dosing : a study of collective intelligence from a clinical data warehouse as a potential model for clinical decision support ». Thesis, Massachusetts Institute of Technology, 2009. http://hdl.handle.net/1721.1/47854.

Abstract:
Thesis (S.M.)--Harvard-MIT Division of Health Sciences and Technology, 2009.
Clinical decision support systems (CDSS) are developed primarily from knowledge gleaned from evidence-based research, guidelines, trusted resources and domain experts. While these resources generally represent information that is research proven, time-tested and consistent with current medical knowledge, they lack some qualities that would be desirable in a CDSS. For instance, the information is presented as generalized recommendations that are not specific to particular patients and may not consider certain subpopulations. In addition, the knowledge base that produces the guidelines may be outdated and may not reflect real-world practice. Ideally, resources for decision support should be timely, patient-specific, and represent current practice. Patient-oriented clinical decision support is particularly important in the practice of pediatrics because it addresses a population in constant flux. Every age represents a different set of physiological and developmental concerns and considerations, especially in medication dosing patterns. Patient clinical data warehouses (CDW) may be able to bridge the knowledge gap. CDWs contain the collective intelligence of various contributors (i.e. clinicians, administrators, etc.) where each data entry provides information regarding medical care for a patient in the real world. CDWs have the potential to provide information as current as the latest upload, be focused to specific subpopulations and reflect current clinical practice. In this paper, I study the potential of a well-known patient clinical data warehouse to provide information regarding pediatric levothyroxine dosing as a form of clinical decision support. I study the state of the stored data, the necessary data transformations and options for representing the data to effectively summarize and communicate the findings.
I also compare the resulting transformed data, representing actual practice within this population, against established dosing recommendations. Of the transformed records, 728 of the 854 (85.2%, [95% confidence interval 82.7:87.6]) medication records contained doses that were under the published recommended range for levothyroxine. As demonstrated by these results, real world practice can diverge from established recommendations. Delivering this information on real-world peer practice medication dosing to clinicians in real time offers the potential to provide a valuable supplement to established dosing guidelines, enhancing the general and sometimes static dosing recommendations.
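The reported proportion can be re-derived from the two counts given in the abstract. The short Python sketch below computes the share of records under the recommended range and a 95% interval using the normal (Wald) approximation. The abstract does not say which interval method the author used, so the Wald formula and the function name `wald_interval` are assumptions, and the computed bounds may differ slightly from the published 82.7:87.6.

```python
import math


def wald_interval(successes, n, z=1.96):
    """Return (proportion, lower, upper) using the normal-approximation interval."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


# Counts taken from the abstract: 728 of 854 records were under the recommended range.
p, lower, upper = wald_interval(728, 854)
print(f"{p:.1%} [{lower:.1%} : {upper:.1%}]")
```

Other interval methods, such as Wilson or exact binomial intervals, give slightly different bounds for the same counts.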
11

Malinowski, Gajda Elzbieta. « Designing conventional, spatial, and temporal data warehouses : concepts and methodological framework ». Doctoral thesis, Universite Libre de Bruxelles, 2006. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/210837.

Abstract:
Decision support systems are interactive, computer-based information systems that provide data and analysis tools to better assist managers at different levels of an organization in the decision-making process. Data warehouses (DWs) have been developed and deployed as an integral part of decision support systems.

A data warehouse is a database that stores large volumes of historical data required for analytical purposes. This data is extracted from operational databases, transformed into a coherent whole, and loaded into a DW during the extraction-transformation-loading (ETL) process.

DW data can be dynamically manipulated using on-line analytical processing (OLAP) systems. DW and OLAP systems rely on a multidimensional model that includes measures, dimensions, and hierarchies. Measures are usually numeric additive values that are used for quantitative evaluation of different aspects of an organization. Dimensions provide different analysis perspectives, while hierarchies allow measures to be analyzed at different levels of detail.

Nevertheless, designers as well as users currently find it difficult to specify the multidimensional elements required for analysis. One reason is the lack of conceptual models for DW and OLAP system design, which would allow data requirements to be expressed at an abstract level without considering implementation details. Another problem is that many kinds of complex hierarchies arising in real-world situations are not addressed by current DW and OLAP systems.

In order to help designers build conceptual models for decision-support systems and to help users better understand the data to be analyzed, in this thesis we propose the MultiDimER model, a conceptual model used for representing multidimensional data for DW and OLAP applications. Our model is mainly based on the existing ER constructs, for example, entity types, attributes, and relationship types with their usual semantics, making it possible to represent the common concepts of dimensions, hierarchies, and measures. It also includes a conceptual classification of different kinds of hierarchies existing in real-world situations and proposes graphical notations for them.

On the other hand, users of DW and OLAP systems now also demand the inclusion of spatial data. The advantage of using spatial data in the analysis process is widely recognized, since its visualization reveals patterns that are difficult to discover otherwise.

However, although DWs typically include a spatial or a location dimension, this dimension is usually represented in an alphanumeric format. Furthermore, there is still a lack of systematic studies that analyze the inclusion and management of hierarchies and measures represented using spatial data.

With the aim of satisfying the growing requirements of decision-making users, we extend the MultiDimER model by allowing spatial data to be included in the different elements composing the multidimensional model. The novelty of our contribution lies in the fact that a multidimensional model is seldom used for representing spatial data. To succeed with our proposal, we applied the research achievements in the field of spatial databases to the specific features of a multidimensional model. The spatial extension of a multidimensional model raises several issues, to which we refer in this thesis, such as the influence of different topological relationships between spatial objects forming a hierarchy on the procedures required for measure aggregations, aggregations of spatial measures, and the inclusion of spatial measures without the presence of spatial dimensions, among others.

Moreover, one of the important characteristics of multidimensional models is the presence of a time dimension for keeping track of changes in measures. However, this dimension cannot be used to model changes in other dimensions.

Therefore, usual multidimensional models are not symmetric in the way of representing changes for measures and dimensions. Further, there is still a lack of analysis indicating which concepts already developed for providing temporal support in conventional databases can be applied and be useful for different elements composing a multidimensional model.

In order to handle temporal changes to all elements of a multidimensional model in a similar manner, we introduce a temporal extension for the MultiDimER model. This extension is based on the research in the area of temporal databases, which have been successfully used for modeling time-varying information for several decades. We propose the inclusion of different temporal types, such as valid and transaction time, which are obtained from source systems, in addition to the DW loading time generated in DWs. We use this temporal support for a conceptual representation of time-varying dimensions, hierarchies, and measures. We also refer to specific constraints that should be imposed on time-varying hierarchies and to the problem of handling multiple time granularities between source systems and DWs.
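To make the distinction between these temporal types concrete, the Python sketch below stores each version of a time-varying dimension member with a valid-time interval, a transaction time, and a DW loading time, and looks up the version valid on a given day. It is only an illustration under stated assumptions: the record layout, the field names, the sample history, and the half-open interval convention are hypothetical and are not taken from the MultiDimER model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DimensionVersion:
    """One version of a time-varying dimension member (hypothetical layout)."""
    key: str
    category: str
    valid_from: date          # valid time: when the value became true in the real world
    valid_to: Optional[date]  # None means still valid; the end bound is exclusive
    recorded_on: date         # transaction time: when the source system stored it
    loaded_on: date           # loading time: when the DW received it


def version_valid_on(versions, key, day):
    """Return the version of `key` whose valid-time interval contains `day`."""
    for v in versions:
        if v.key == key and v.valid_from <= day and (v.valid_to is None or day < v.valid_to):
            return v
    return None


# Hypothetical history: product P1 moves from one category to another.
HISTORY = [
    DimensionVersion("P1", "Beverages", date(2004, 1, 1), date(2005, 6, 1),
                     date(2004, 1, 3), date(2004, 1, 4)),
    DimensionVersion("P1", "Dairy", date(2005, 6, 1), None,
                     date(2005, 6, 2), date(2005, 6, 3)),
]

print(version_valid_on(HISTORY, "P1", date(2005, 1, 1)).category)
```

Keeping the three times as separate fields means a query can ask what was true in the real world on a date, what the source system knew on a date, or what the warehouse contained on a date, without the answers being conflated.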

Furthermore, the design of DWs is not an easy task. It requires considering all phases from requirements specification to final implementation, including the ETL process. It should also take into account that the inclusion of different data items in a DW depends on both users' needs and data availability in source systems. However, designers currently must rely on their experience due to the lack of a methodological framework that considers the above-mentioned aspects.

In order to assist developers during the DW design process, we propose a methodology for the design of conventional, spatial, and temporal DWs. We refer to different phases, such as requirements specification, conceptual, logical, and physical modeling. We include three different methods for requirements specification depending on whether users, operational data sources, or both are the driving force in the process of requirement gathering. We show how each method leads to the creation of a conceptual multidimensional model. We also present logical and physical design phases that refer to DW structures and the ETL process.

To ensure the correctness of the proposed conceptual models, i.e. with conventional data, with the spatial data, and with time-varying data, we formally define them providing their syntax and semantics. With the aim of assessing the usability of our conceptual model including representation of different kinds of hierarchies as well as spatial and temporal support, we present real-world examples. Pursuing the goal that the proposed conceptual solutions can be implemented, we include their logical representations using relational and object-relational databases.


Doctorate in applied sciences
info:eu-repo/semantics/nonPublished

12

Hu, Yang. « Temporal Change in the Power Production of Real-world Photovoltaic Systems Under Diverse Climatic Conditions ». Case Western Reserve University School of Graduate Studies / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=case1481295879868785.

13

Osop, Hamzah Bin. « A practice-based evidence approach for clinical decision support ». Thesis, Queensland University of Technology, 2018. https://eprints.qut.edu.au/123320/2/Hamzah%20Bin%20Osop%20Thesis.pdf.

Abstract:
This thesis studies the conceptualisation and evaluation of a Practice-Based Evidence approach to decision making in healthcare. It examines the existing ICT architecture of a public hospital in Singapore to design a decision support system that leverages practical clinical evidence meaningfully captured in electronic health records. In doing so, healthcare professionals are supported in decision making through findings from past similar patients that can be generalised to the current patient population.
14

Joaquim, Neto Cesar. « Análise de desempenho de consultas OLAP espaçotemporais em função da ordem de processamento dos predicados convencional, espacial e temporal ». Universidade Federal de São Carlos, 2016. https://repositorio.ufscar.br/handle/ufscar/8056.

Abstract:
No funding received.
By providing ever-growing processing capabilities, many database technologies have become important support tools for enterprises and institutions. The need to include and control new data types in existing database technologies has also brought new challenges and research areas, giving rise to spatial, temporal, and spatio-temporal databases. In addition, new analytical capabilities were required, leading to data warehouse technology and, once more, to the need to include spatial or temporal data (or both) in it, thus originating spatial, temporal, and spatio-temporal data warehouses. The queries used with each database type have also evolved, culminating in STOLAP (Spatio-Temporal OLAP) queries, which are composed of predicates over conventional, spatial, and temporal data and whose execution can be aided by specialized index structures. This work investigates how the execution of each predicate type affects the performance of STOLAP queries by varying the indexes used, the predicate execution order, and the query selectivity. Bitmap Join Indexes support the execution of conventional predicates and some parts of the temporal predicates, which are also handled with SQL queries in some of the alternatives studied. The SB-index and HSB-index aid the spatial processing, while the STB-index processes temporal and spatial predicates together. The expected result is an analysis of the best predicate execution order for STOLAP queries, also considering their selectivity. Another contribution of this work is the evolution of the HSB-index into a hierarchized version called HSTB-index, which complements the STOLAP query processing options.
15

Goncy, Elizabeth A. « Conflict and Temporal and Relational Spillover of Conflict in Young Adult Romantic Relationships : Impact of Interparental and Parent-Child Relationships ». Kent State University / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=kent1310482081.

16

Jouhet, Vianney. « Automated adaptation of Electronic Heath Record for secondary use in oncology ». Thesis, Bordeaux, 2016. http://www.theses.fr/2016BORD0373/document.

Abstract:
With the increasing adoption of Electronic Health Records (EHRs), the amount of data produced throughout patient care is growing rapidly. Secondary use of these data is therefore an important issue for health research and evaluation. In this work we discuss the issues related to data representation and semantics within EHRs that need to be addressed in order to facilitate secondary use of structured data in oncology. We propose and evaluate ontology-based methods for integrating heterogeneous diagnosis terminologies in oncology. We then extend the resulting model to represent tumoral disease and its links with diagnoses as recorded in EHRs. Finally, we propose and implement a complete architecture combining a clinical data warehouse, a metadata registry, and semantic web technologies and standards. This architecture enables the syntactic and semantic integration of a broad range of hospital information system observations. Our approach links data with external knowledge (in the form of ontologies) to provide a knowledge resource for an algorithm that identifies tumoral disease from the diagnoses recorded within EHRs. Because it is based on the ontology classes, the identification algorithm is independent of the data actually recorded and uses an integrated view of diagnoses, avoiding semantic heterogeneity. Building the algorithm on top of an ontology offers a flexible solution: adapting the ontology, for instance by modifying its granularity, provides a way to adapt the aggregation rules to specific identification needs.
17

SABAINI, Alberto. « Temporal Data Analysis and Mining. A Multidimensional Approach and its Application in a Medical Domain ». Doctoral thesis, 2015. http://hdl.handle.net/11562/911786.

Abstract:
The increasing amount of data available in all sectors is raising the need for decision makers to perform sophisticated analyses in order to deal with today's highly competitive world. Decision makers need several databases in order to analyze an organization as a whole. These data sources are often scattered and differ from one another in content and format. Their integration is crucial for the decision-making process, and advanced analyses are needed for such a crucial task. This problem may be addressed by the data warehousing approach. Data warehouses can be queried and analyzed by means of Online Analytical Processing (OLAP) and data mining tools. Decision support systems have recently been applied to the medical domain. Conventional multidimensional approaches prove insufficient to meet clinical domain requirements in terms of representation and advanced temporal support. Time is an important and pervasive concept of the real world that needs to be adequately modeled. Indeed, clinical domains are characterized by several temporal aspects. For instance, a therapy may be characterized by a start date, an end date, a first drug administration date, and so on. In this thesis we first deal with the design and development of a business intelligence solution for pharmacovigilance tasks. This system, called VigiSegn, was created in the context of a project in collaboration with the Italian Ministry of Health on the surveillance of drugs on the Italian territory. We focus on the needs of domain experts for analyzing and assessing suspected adverse drug reaction cases. Such needs were not satisfied by current data models. We address advanced modeling aspects for multidimensional structures, paying particular attention to the temporal features of data. We provide a formal definition of a multidimensional model for representing complex facts, addressing the issue of adequately representing interactions between multidimensional cubes.
We further extend the proposed model by underlining the importance of considering both point-based and interval-based semantics when analyzing temporal data. This includes advanced interval-based temporal operations and trend discovery. We also provide a sound data mining algorithm, focused on mining (approximate) temporal functional dependencies based on a temporal grouping of tuples.
18

Tseng, Chin-shun, and 曾勁順. « A Framework of Object-Relational Data Warehouse for Clinical Data Integration ». Thesis, 2005. http://ndltd.ncl.edu.tw/handle/91352395391660316708.

Abstract:
Master's
I-Shou University
Master's Program, Department of Information Management
93
In recent years, with the development of medical informatics and the rapid change of the management environment of medical organizations, how to effectively integrate an organization's internal medical information to support decision-level analysis has become a pressing topic in medical informatics. For this reason, many medium and large medical organizations have begun introducing so-called clinical data warehouse systems, hoping to use the data warehouse structure that is well established in the business world to meet the information demands of various medical decisions and analyses. However, current data warehouse systems are built upon relational databases, and the star schema is only suitable for character and numeric data, supporting multidimensional statistical analysis that directly observes changes in numeric fields. For the large volume of non-character, non-numeric medical data, such as image files (for instance X-ray, ECG, ultrasound, and CT) and prescriptions, it cannot offer effective organization, storage, integration, and analysis of heterogeneous data. For this reason, in this research we propose a new data warehouse architecture based on an object-relational database, together with a data model suitable for an object-relational data warehouse. The feasibility of the proposed data model is illustrated by constructing a clinical data warehouse and data marts for some disease instances.
19

Zhou, X., B. Liu, X. Zhang, Q. Xie, R. Zhang, Y. Wang and Yonghong Peng. « Data mining in real-world traditional Chinese medicine clinical data warehouse ». 2014. http://hdl.handle.net/10454/10832.

Abstract:
Real-world clinical settings are the major arena of traditional Chinese medicine (TCM), which has accumulated long-term practical clinical experience and developed established theoretical knowledge and clinical solutions suitable for personalized treatment. Clinical phenotypes, which are diverse and change dynamically in real-world clinical settings, have been the most important features captured by TCM for diagnosis and treatment. Together with clinical prescriptions that combine multiple herbal ingredients, TCM clinical activities embody immensely valuable, high-dimensional data for knowledge distillation and hypothesis generation. In China, with the curation of large-scale real-world clinical data from regular clinical activities, transforming these data into clinically insightful knowledge has increasingly become a hot topic in the TCM field. This chapter introduces the application of data warehouse techniques and data mining approaches to real-world TCM clinical data, which come mainly from electronic medical records. The main framework of clinical data mining applications in the TCM field is also introduced, with emphasis on related work in this field. The key points and issues for improving research quality are discussed, and future directions are proposed.
20

Mantovani, Matteo. « Approximate Data Mining Techniques on Clinical Data ». Doctoral thesis, 2020. http://hdl.handle.net/11562/1018039.

Abstract:
The past two decades have witnessed an explosion in the number of medical and healthcare datasets available to researchers and healthcare professionals. Data collection efforts are highly required, and this prompts the development of appropriate data mining techniques and tools that can automatically extract relevant information from data. Consequently, they provide insights into various clinical behaviors or processes captured by the data. Since these tools should support decision-making activities of medical experts, all the extracted information must be represented in a human-friendly way, that is, in a concise and easy-to-understand form. To this purpose, here we propose a new framework that collects different new mining techniques and tools proposed. These techniques mainly focus on two aspects: the temporal one and the predictive one. All of these techniques were then applied to clinical data and, in particular, ICU data from MIMIC III database. It showed the flexibility of the framework, which is able to retrieve different outcomes from the overall dataset. The first two techniques rely on the concept of Approximate Temporal Functional Dependencies (ATFDs). ATFDs have been proposed, with their suitable treatment of temporal information, as a methodological tool for mining clinical data. An example of the knowledge derivable through dependencies may be "within 15 days, patients with the same diagnosis and the same therapy usually receive the same daily amount of drug". However, current ATFD models are not analyzing the temporal evolution of the data, such as "For most patients with the same diagnosis, the same drug is prescribed after the same symptom". To this extent, we propose a new kind of ATFD called Approximate Pure Temporally Evolving Functional Dependencies (APEFDs). Another limitation of such kind of dependencies is that they cannot deal with quantitative data when some tolerance can be allowed for numerical values. 
In particular, this limitation arises in clinical data warehouses, where analysis and mining have to consider one or more measures related to quantitative data (such as lab test results and vital signs), concerning multiple dimensional (alphanumeric) attributes (such as patient, hospital, physician, diagnosis) and some time dimensions (such as the day since hospitalization and the calendar date). According to this scenario, we introduce a new kind of ATFD, named Multi-Approximate Temporal Functional Dependency (MATFD), which considers dependencies between dimensions and quantitative measures from temporal clinical data. These new dependencies may provide new knowledge as "within 15 days, patients with the same diagnosis and the same therapy receive a daily amount of drug within a fixed range". The other techniques are based on pattern mining, which has also been proposed as a methodological tool for mining clinical data. However, many methods proposed so far focus on mining of temporal rules which describe relationships between data sequences or instantaneous events, without considering the presence of more complex temporal patterns into the dataset. These patterns, such as trends of a particular vital sign, are often very relevant for clinicians. Moreover, it is really interesting to discover if some sort of event, such as a drug administration, is capable of changing these trends and how. To this extent, we propose a new kind of temporal patterns, called Trend-Event Patterns (TEPs), that focuses on events and their influence on trends that can be retrieved from some measures, such as vital signs. With TEPs we can express concepts such as "The administration of paracetamol on a patient with an increasing temperature leads to a decreasing trend in temperature after such administration occurs". We also decided to analyze another interesting pattern mining technique that includes prediction. 
This technique discovers a compact set of patterns that aim to describe the condition (or class) of interest. Our framework relies on a classification model that considers and combines various predictive pattern candidates and selects only those that are important for improving the overall class prediction performance. We show that our classification approach achieves a significant reduction in the number of extracted patterns compared to state-of-the-art methods based on the minimum predictive pattern mining approach, while preserving the overall classification accuracy of the model. For each technique described above, we developed a tool to retrieve its kind of rule. All the results are obtained by pre-processing and mining clinical data, in particular, as mentioned before, ICU data from the MIMIC-III database.
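As an illustration of the approximate temporal functional dependencies discussed in this abstract, the following minimal Python sketch measures how far a dependency such as "diagnosis, therapy → dose" is from holding within fixed windows of days. It is not the thesis's algorithm: the function name `g3_error`, the non-overlapping window scheme, the error measure (fraction of tuples that would have to be removed), and the toy rows are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def g3_error(rows, lhs, rhs, time_attr, window_days):
    """Fraction of rows to drop so that lhs -> rhs holds inside every time window.

    Hypothetical helper: windows are fixed and non-overlapping, which is only
    one possible form of temporal grouping.
    """
    if not rows:
        return 0.0
    groups = defaultdict(Counter)
    for row in rows:
        window = row[time_attr] // window_days
        key = (window,) + tuple(row[a] for a in lhs)
        groups[key][tuple(row[a] for a in rhs)] += 1
    # In each group, keep only the rows that agree with the most frequent rhs value.
    kept = sum(max(counter.values()) for counter in groups.values())
    return 1 - kept / len(rows)


# Toy data (invented): day since hospitalization, diagnosis, therapy, daily dose.
rows = [
    {"day": 1, "diagnosis": "D1", "therapy": "T1", "dose": 10},
    {"day": 3, "diagnosis": "D1", "therapy": "T1", "dose": 10},
    {"day": 9, "diagnosis": "D1", "therapy": "T1", "dose": 20},
    {"day": 20, "diagnosis": "D1", "therapy": "T1", "dose": 20},
]

error = g3_error(rows, ["diagnosis", "therapy"], ["dose"], "day", 15)
holds_approximately = error <= 0.3  # 0.3 is an arbitrary tolerance
```

A dependency is then accepted as approximate when the error stays below a chosen tolerance; a tolerance on numeric values, as in the MATFD idea, would require comparing ranges instead of exact values.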
21

Oliveira, Vitor Hugo Fernandes. « Conceção de um data warehouse espácio-temporal para análise de trajetórias humanas ». Master's thesis, 2013. http://hdl.handle.net/10451/9874.

Abstract:
Master's project in Informatics, presented to the Universidade de Lisboa, through the Faculdade de Ciências, 2013
With the evolution of the mobile technologies available to users, there has been a significant growth in the volume of data generated by these devices. The availability of these large quantities of information, for example about the location of mobile users and their trajectories, enhances the knowledge and study of the activities, preferences, behavior patterns, and mobility of those users in both space and time. In order to extract useful and relevant information, it is critical to design appropriate methods for processing, analysis, knowledge discovery, and data mining. However, the existing data on human mobility still contain redundancies, inconsistencies, and little semantic information, and knowledge discovery solutions and data mining algorithms specially designed for this type of spatio-temporal data are still scarce. This thesis proposes a model of a Spatio-Temporal Data Warehouse of human trajectories, as well as the processes required for data processing and enrichment with semantic information, such as the extraction of stay points and an algorithm for finding similar users based on geographic information. This model aims to lay the groundwork for the development of applications and algorithms for detecting the behaviors and activities of mobile users. It is tested on a concrete example, the Geolife data set, for a population of 182 users with about 24 million geolocated points in trajectories. The results show that the developed system allows highly complex levels of analysis while offering great flexibility for analytical processing, showing its usefulness for business processes such as urban planning, traffic analysis, and user profile analysis.
22

Tavazzi, Erica. « Exploiting the temporal dimension in clinical data mining ». Doctoral thesis, 2020. http://hdl.handle.net/11577/3359241.

23

Cheng, Jay Jojo. « On identifying polycystic ovary syndrome in the Clinical Data Warehouse at Boston Medical Center ». Thesis, 2017. https://hdl.handle.net/2144/23764.

Abstract:
INTRODUCTION: Polycystic ovary syndrome (PCOS) is characterized by hyperandrogenemia, oligoanovulation, and numerous ovarian cysts. Although the most common cause of female factor infertility, its characteristics and metabolic risks are difficult to study due to its heterogeneity. Additionally, ethnic-specific data is scarce. Hospital electronic medical records and the diverse patient population at Boston Medical Center (BMC) may provide an avenue for investigating the longitudinal nature of PCOS and its race-specific characteristics. OBJECTIVES: 1. Describe the Clinical Data Warehouse (CDW) dataset available for studying PCOS. 2. Develop an automated method for extracting ovarian features from written ultrasound reports. 3. Identify PCOS patients from their record of the three cardinal PCOS features. METHODS: Patients evaluated on at least one of the three cardinal PCOS features, between October 1, 2003 and September 30, 2015 were queried from the BMC CDW. This thesis describes methods for cleaning the data, as well as the development of an ultrasound classifier based on natural language processing techniques. RESULTS: On a validation set of 1000 random ultrasounds, the automatic ultrasound classifier had a recall and precision for the presence of PCOM, 99.0% and 94.2%, respectively. Overall, 2421 cases of PCOS were identified, with 1010 not receiving a diagnosis. Black patients had twice the odds of being underdiagnosed compared to White patients (OR: 2.09; 95% CI: 1.69–2.59). CONCLUSIONS: Ascertaining PCOS through the medical record offers advantages over self-reported PCOS, including documentation of disease and recorded measurements. In the future, this PCOS dataset can be used in conjunction with cardiovascular and metabolic outcomes for developing a predictive model.
24

Dietrich, Georg. « Ad Hoc Information Extraction in a Clinical Data Warehouse with Case Studies for Data Exploration and Consistency Checks ». Doctoral thesis, 2019. https://nbn-resolving.org/urn:nbn:de:bvb:20-opus-184642.

Abstract:
The importance of Clinical Data Warehouses (CDW) has increased significantly in recent years as they support or enable many applications such as clinical trials, data mining, and decision making. CDWs integrate Electronic Health Records which still contain a large amount of text data, such as discharge letters or reports on diagnostic findings in addition to structured and coded data like ICD-codes of diagnoses. Existing CDWs hardly support features to gain information covered in texts. Information extraction methods offer a solution for this problem but they have a high and long development effort, which can only be carried out by computer scientists. Moreover, such systems only exist for a few medical domains. This paper presents a method empowering clinicians to extract information from texts on their own. Medical concepts can be extracted ad hoc from e.g. discharge letters, thus physicians can work promptly and autonomously. The proposed system achieves these improvements by efficient data storage, preprocessing, and with powerful query features. Negations in texts are recognized and automatically excluded, as well as the context of information is determined and undesired facts are filtered, such as historical events or references to other persons (family history). Context-sensitive queries ensure the semantic integrity of the concepts to be extracted. A new feature not available in other CDWs is to query numerical concepts in texts and even filter them (e.g. BMI > 25). The retrieved values can be extracted and exported for further analysis. This technique is implemented within the efficient architecture of the PaDaWaN CDW and evaluated with comprehensive and complex tests. The results outperform similar approaches reported in the literature. Ad hoc IE determines the results in a few (milli-) seconds and a user friendly GUI enables interactive working, allowing flexible adaptation of the extraction. 
In addition, the applicability of this system is demonstrated in three real-world applications at the Würzburg University Hospital (UKW). Several drug trend studies are replicated: findings of five studies on high blood pressure, atrial fibrillation, and chronic renal failure can be partially or completely confirmed at the UKW. Another case study evaluates the prevalence of heart failure among hospital inpatients using an algorithm that extracts information with ad hoc IE from discharge letters, echocardiogram reports (e.g. LVEF < 45), and other sources of the hospital information system. This study reveals that the use of ICD codes leads to a significant underestimation (31%) of the true prevalence of heart failure. The third case study evaluates the consistency of diagnoses by comparing structured ICD-10-coded diagnoses with the diagnoses described in the diagnostic section of the discharge letter. These diagnoses are extracted from texts with ad hoc IE, using synonyms generated with a novel method. The developed approach can extract diagnoses from the discharge letter with high accuracy and can furthermore determine the degree of consistency between the coded and reported diagnoses.
The importance of Clinical Data Warehouses (CDW) has grown considerably in recent years, as they support or enable many applications such as clinical studies, data mining, and decision making. CDWs integrate electronic patient records which, besides structured and coded data such as ICD codes of diagnoses, still contain a great deal of text data, such as discharge letters or findings reports. Existing CDWs barely support functions for using the information contained in these texts. Information extraction methods do offer a solution to this problem, but they require a large and lengthy development effort that can only be carried out by computer scientists. Moreover, such systems exist for only a few medical domains. This work presents a method that enables physicians to extract information from texts on their own. Medical concepts can be extracted ad hoc from texts (e.g. discharge letters), so that physicians can work promptly and autonomously. The presented system achieves these improvements through efficient data storage, preprocessing, and powerful query functions. Negations in texts are detected and automatically excluded; likewise, the context of information is determined and undesired facts are filtered out, such as historical events or references to other persons (family history). Context-sensitive queries ensure the semantic integrity of the concepts to be extracted. A new function not available in other CDWs is the querying of numerical concepts in texts, and even filtering on them (e.g. BMI > 25). The retrieved values can be extracted and exported for further analysis. This technique is implemented within the efficient architecture of the PaDaWaN CDW and evaluated with extensive and elaborate tests. The results surpass similar approaches described in the literature.
Ad hoc IE determines the results in a few (milli)seconds, and the user-friendly interface enables interactive work and flexible adaptation of the extraction. In addition, the applicability of this system is demonstrated in three real-world applications at the Würzburg University Hospital (UKW). Several medication trend studies are replicated: the results of five studies on high blood pressure, atrial fibrillation, and chronic renal failure can be partially or fully confirmed at the UKW. A further case study assesses the prevalence of heart failure among hospital inpatients with an algorithm that uses ad hoc IE to extract information from discharge letters, echocardiogram reports, and other sources of the hospital information system (e.g. LVEF < 45). This study shows that the use of ICD codes leads to a significant underestimation (31%) of the true prevalence of heart failure. The third case study assesses the consistency of diagnoses by comparing structured ICD-10-coded diagnoses with the diagnoses described in the diagnosis section of the discharge letter. These diagnoses are obtained from the texts with ad hoc IE, using synonyms generated with a novel method. The approach used can extract diagnoses from discharge letters with high accuracy and, in addition, determine the degree of agreement between the coded and the described diagnoses.
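The numeric query described in this abstract (for example BMI > 25, with negated mentions excluded) can be illustrated with a minimal sketch. This is not the PaDaWaN implementation: the function name `extract_numeric_concept`, the tiny negation cue list, and the sample note are all hypothetical, and a real system would need far more robust sentence splitting and context detection.

```python
import re

# Hypothetical, deliberately tiny list of negation cues (assumption for this sketch).
NEGATION_PATTERN = re.compile(r"\b(no|not|without|denies)\b", re.IGNORECASE)


def extract_numeric_concept(text, concept):
    """Return numeric values stated for `concept` in sentences without a negation cue."""
    value_pattern = re.compile(
        rf"\b{re.escape(concept)}\b\s*(?:of|is|was|:|=)?\s*(\d+(?:\.\d+)?)",
        re.IGNORECASE,
    )
    values = []
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if NEGATION_PATTERN.search(sentence):
            continue  # skip the whole sentence if it contains a negation cue
        values.extend(float(m.group(1)) for m in value_pattern.finditer(sentence))
    return values


# Made-up note text, for illustration only.
note = "BMI: 27.4 at admission. LVEF was 40 on echo. No BMI of 31 was recorded."
bmi_over_25 = [v for v in extract_numeric_concept(note, "BMI") if v > 25]
lvef_under_45 = [v for v in extract_numeric_concept(note, "LVEF") if v < 45]
```

Filtering happens after extraction, so the same extracted values could also be exported unfiltered for further analysis, as the abstract mentions.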
25

Soares, Diogo Filipe Marques. « Learning predictive models from temporal three-way data using triclustering : applications in clinical data analysis ». Master's thesis, 2020. http://hdl.handle.net/10451/48139.

Abstract:
Master's thesis, Data Science, Universidade de Lisboa, Faculdade de Ciências, 2020
The concept of triclustering extends the concept of biclustering to a three-dimensional space, with the aim of finding coherent subspaces in three-dimensional data. For data with a temporal dimension, the need to learn interesting temporal patterns and use them to learn effective and interpretable predictive models creates a need to investigate new methodologies for analysing three-dimensional data. In this work, we propose two methodologies for this purpose. In the first, we find the best parameters to use in triclustering in order to discover the best triclusters (sets of objects with a coherent pattern along a given set of time points), so that these patterns can then be used as features by one of the most appropriate classifiers found in the literature. In this case, we propose combining the classifier with a temporal triclustering approach. To that end, we devised a triclustering algorithm with a temporal constraint, named TCtriCluster, to uncover temporally contiguous triclusters (made up of contiguous time points). In the second methodology, we added a biclustering phase to discover patterns in the static data (data that do not change over time) and combine them with the triclusters to improve the performance and interpretability of the models. These methodologies were used to predict the need for non-invasive ventilation (NIV) in patients with Amyotrophic Lateral Sclerosis (ALS). In this case study, we learnt general prognostic models, from the data of all patients, and specialized models, after stratifying the patients into three progression groups: Slow, Neutral, and Fast.
The results show that, besides being quite comparable and sometimes superior to the results obtained by a high-performance classifier (Random Forests), our classifiers are able to refine predictions through the potential of model interpretability. Indeed, by using triclusters (and biclusters) as predictors, we promote the use of highly interpretable disease progression patterns. Furthermore, when used for prognostic prediction in ALS patients, our interpretable predictive models uncovered clinically relevant patterns for a specific group of disease progression patterns, helping physicians to understand the high heterogeneity of ALS progression. The results also show that the temporal constraint helps improve the effectiveness and predictive power of the models.
Triclustering extends biclustering to three-dimensional space, aiming to find coherent subspaces in three-way data (sets of objects described by subsets of features in a subset of contexts). When the context is time, the need to learn interesting temporal patterns and use them to learn effective and interpretable predictive models calls for new methodologies for three-way data analysis. In this work, we propose two approaches to learning predictive models from three-way data: 1) a triclustering-based classifier (considering just temporal data) and 2) a mixture of biclustering (with static data) and triclustering (with temporal data). In the first approach, we find the best triclustering parameters to uncover the best triclusters (sets of objects with a coherent pattern along a set of time points) and then use these patterns as features in a state-of-the-art classifier. In the case of temporal data, we propose coupling the classifier with a temporal triclustering approach. With this aim, we devised a temporally constrained triclustering algorithm, termed TCtriCluster, to mine time-contiguous triclusters. In the second approach, we extended the triclustering-based classifier with a biclustering task, in which biclusters are discovered in static data (unchanged over time) and integrated with triclusters to improve performance and model explainability. The proposed methodologies were used to predict the need for non-invasive ventilation (NIV) in patients with Amyotrophic Lateral Sclerosis (ALS). In this case study, we learnt a general prognostic model from all patients' data and specialized models after stratifying patients into Slow, Neutral, and Fast progressors. Our results show that, besides performing comparably to and sometimes better than a high-performing random forest classifier, our predictive models enhance prediction with the potential of model interpretability.
Indeed, when using triclusters (and biclusters) as predictors, we promote the use of highly interpretable disease progression patterns. Furthermore, when used for prognostic prediction in ALS, our interpretable predictive models unravelled clinically relevant and group-specific disease progression patterns, helping clinicians to understand the high heterogeneity of ALS disease progression. Results further show that the temporal restriction improves the effectiveness of the predictive models.
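The temporal restriction mentioned in this abstract requires a tricluster to span contiguous time points. The sketch below only illustrates that contiguity condition on integer time indices; it is not the TCtriCluster algorithm, and the function names `is_time_contiguous` and `contiguous_runs` are hypothetical.

```python
def is_time_contiguous(time_points):
    """True if the integer time indices form one unbroken run (e.g. 2, 3, 4)."""
    points = sorted(set(time_points))
    return bool(points) and points[-1] - points[0] + 1 == len(points)


def contiguous_runs(time_points):
    """Split integer time indices into maximal runs of consecutive values."""
    runs = []
    for t in sorted(set(time_points)):
        if runs and t == runs[-1][-1] + 1:
            runs[-1].append(t)  # extends the current run
        else:
            runs.append([t])  # starts a new run
    return runs


# Made-up time indices, for illustration only.
candidate_time_points = [1, 3, 4, 7]
runs = contiguous_runs(candidate_time_points)
```

One possible use, under these assumptions, is to split an unconstrained pattern into candidate time-contiguous pieces before using them as features.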
26

Lin, Sheng-Hui, and 林昇輝. « Data warehouse approach to build a decision-support platform for orthopedics based on clinical and academic requirements ». Thesis, 2009. http://ndltd.ncl.edu.tw/handle/19788328253537026226.

Abstract:
Master's thesis
Taipei Medical University
Graduate Institute of Medical Informatics
97
Continuous quality improvement has become a trend in contemporary medicine, and it can be achieved by implementing a specialty database. Decision-support systems for academic and clinical purposes are part of this continuous quality improvement process. A conventional database is limited in its support for decisions because it lacks online analytical functions. A data warehouse offers sophisticated functions for decision-support processes. However, implementing a data warehouse may face many obstacles, including high cost and large staffing requirements. We had previously established an orthopedics database, which has collected patient data since 2002. The new system was built on this specialty database, and its knowledge architecture was developed through a specialist committee and accreditation indicators. Its major function is to generate sufficient information for academic and clinical decision support. The system executes more efficiently than the database. The unique knowledge architecture can become a distinguishing feature of the department. The savings in personnel cost and in the time needed to generate accreditation reports are remarkable. Satisfaction with the web-based interface was assessed through questionnaires, and the outcome was as satisfactory as we had expected. The sophisticated functions of a data warehouse are hard to realize in a single hospital department, especially one that already owns a traditional database. The experience of building this system can serve as one option for upgrading a specialty database and as a step toward the goal of continuous quality improvement.
27

Carneiro, Brian Neil. « Clinical intelligence : definição de processos de ETL e DW ». Master's thesis, 2017. http://hdl.handle.net/1822/53104.

Abstract:
Integrated master's dissertation in Engineering and Management of Information Systems
The Centro Hospitalar do Porto (CHP) is considered a reference in the field of corneal transplantation, having performed more than 4000 transplants to date. The cornea, in turn, is the most transplanted tissue in the world, and transplantation is as a rule the main method of recovering from blindness caused by diseases of that tissue. Given the importance of this area for the CHP, the need arose to study the corneal transplantation process through a Clinical Intelligence (CI) solution. The purpose of this dissertation was to develop a CI solution capable of supporting the decisions of CHP clinicians and managers about the corneal transplantation process, not only from the perspective of the patients' data, but also of the transplant itself and the waiting list. The CI prototype, as initially defined, contained the Business Intelligence (BI) component, with a focus on defining the processes for extracting, transforming, and loading the data into the Data Warehouse (DW). Later, the possibility arose of incorporating Data Mining (DM) techniques, which made it possible, above all, to predict surgery priorities and patient waiting time. Three methodologies were followed in designing the CI prototype: Design Science Research, as the main approach to developing the work; Kimball's lifecycle for building the DW; and the Cross-Industry Standard Process for Data Mining for the DM process. From the BI perspective, the prototype makes it possible to understand the characteristics of the procedures, diagnoses, and patients, and the relationships between them. It also provides an analysis of the inflow and outflow of patients, as well as the average waiting time, in days, between them. From the DM perspective, models were created that can predict a patient's waiting time as well as the priorities of normal-type procedures, meeting the CHP's acceptance standards (Sensitivity >= 0.85; Precision >= 0.75).
The best models obtained sensitivity and accuracy values of 95% and 83%, or 93% and 82%, respectively, for certain classes of the targets. From an overall CI perspective, the prototype ensures the integration and quality of the data, as well as efficient handling of these data through reports, providing optimized information to CHP clinicians and managers. Integrating the DM models into BI provides greater efficiency in monitoring the patient's state of health and the CHP's logistical and human resources. In summary, 32 visualization reports, 42 business metrics, and 320 DM models were developed, together with three scientific articles to disseminate the work.
The Hospital Center of Porto (CHP) is considered a reference in the field of corneal transplantation and has performed 4000 transplants so far. The cornea is the most transplanted tissue in the world, and transplantation is usually the main method of recovering from blindness caused by diseases of this tissue. Thus, the study of the corneal transplantation process through a Clinical Intelligence (CI) solution was considered a priority for the CHP. The purpose of this dissertation was to develop a CI prototype capable of supporting the decisions of CHP physicians and managers regarding the corneal transplantation process, not only from the perspective of the patient's information, but also from that of the transplant itself and the waiting list. The prototype, as defined from the beginning, contained the Business Intelligence (BI) component, focusing on the extraction, transformation, and loading of the data into the Data Warehouse (DW). Afterwards, the possibility of incorporating Data Mining (DM) emerged, which made it possible to make predictions regarding certain targets. The development of the CI prototype followed three methodologies: Design Science Research, as the main approach for the work; Kimball's lifecycle for the development of the DW; and the Cross-Industry Standard Process for Data Mining for the DM process. From the BI perspective, the CI prototype makes it possible to understand the procedures, diagnoses, and patient characteristics, and the relationships between them. In addition, it provides an analysis of the inflow and outflow of patients, as well as the average waiting time, in days, between them. From the DM perspective, models capable of predicting the patient's waiting time as well as the procedural priority were created. The best models obtained sensitivity and accuracy values of 95% and 83%, or 93% and 82%, respectively, for certain target classes.
From a global perspective, the CI prototype ensures data integration and quality, as well as efficient reporting of these data, thereby providing optimized information to CHP's physicians and managers. The integration of DM models into BI provides greater efficiency in monitoring the patient's health as well as the logistics and human resources of the CHP. In summary, 29 visualization reports, 42 business metrics, and 320 DM models were developed, along with three scientific articles to disseminate the developed solution.
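The Portuguese version of this abstract quotes acceptance standards for the DM models (Sensibilidade >= 0,85; Precisão >= 0,75). As a minimal sketch of how such a check could be computed for one target class, assuming "Precisão" is read as precision (positive predictive value): the function names and the labels below are hypothetical and do not come from the dissertation.

```python
def sensitivity_and_precision(y_true, y_pred, positive):
    """Compute sensitivity (recall) and precision for one target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, precision


def meets_acceptance(sensitivity, precision, min_sensitivity=0.85, min_precision=0.75):
    """Check a model against the thresholds quoted in the abstract."""
    return sensitivity >= min_sensitivity and precision >= min_precision


# Made-up labels, for illustration only.
y_true = ["urgent", "urgent", "normal", "urgent", "normal", "urgent", "normal", "urgent"]
y_pred = ["urgent", "urgent", "urgent", "urgent", "normal", "urgent", "normal", "urgent"]
sens, prec = sensitivity_and_precision(y_true, y_pred, positive="urgent")
accepted = meets_acceptance(sens, prec)
```

With these made-up labels there are 5 true positives, 0 false negatives, and 1 false positive, so the hypothetical model would pass both thresholds.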
28

Mehrabi, Saeed. « Advanced natural language processing and temporal mining for clinical discovery ». 2015. http://hdl.handle.net/1805/8895.

Abstract:
Indiana University-Purdue University Indianapolis (IUPUI)
There is a vast and growing amount of healthcare data, especially with the rapid adoption of electronic health records (EHRs) as a result of the HITECH Act of 2009. It is estimated that around 80% of clinical information resides in the unstructured narrative of an EHR. Recently, natural language processing (NLP) techniques have offered opportunities to extract information from unstructured clinical texts needed for various clinical applications. A popular method for enabling secondary uses of EHRs is information or concept extraction, a subtask of NLP that seeks to locate and classify elements within text based on the context. Extracting clinical concepts without considering the context has many complications, including inaccurate diagnosis of patients and contamination of study cohorts. Identifying the negation status and whether a clinical concept belongs to the patient or to a family member are two of the challenges faced in context detection. A negation algorithm called Dependency Parser Negation (DEEPEN) has been developed in this research study by taking into account the dependency relationship between negation words and concepts within a sentence using the Stanford Dependency Parser. The study results demonstrate that DEEPEN can reduce the number of incorrect negation assignments for patients with positive findings, and therefore improve the identification of patients with the target clinical findings in EHRs. Additionally, an NLP system consisting of section segmentation and relation discovery was developed to identify patients' family history. To assess the generalizability of the negation and family history algorithms, data from a different clinical institution was used in both algorithm evaluations.
29

Mendes, Celso Rafael Clara. « Visão e Análise Temporal do Processo Clínico ». Master's thesis, 2016. http://hdl.handle.net/10316/99229.

Abstract:
Final internship report for the Master's in Informatics Engineering, presented to the Faculty of Sciences and Technology of the University of Coimbra.
Nowadays, patients' clinical records can be considerably extensive, containing everything from information on tests performed to any health problems that may exist. Given the volume of data that may exist, analysing and visualizing it becomes difficult. In addition, besides reading and interpretation being a time-consuming process, the information may also be in different documents/locations, sometimes in reports and not fully formatted. One possible solution to ease, and in general improve, the process is to use a visualization that reduces cognitive effort and makes it possible to identify and extract the relevant information and draw conclusions more quickly. In this project my aim is to study and implement the identified solution, so that a health professional can view clinical information quickly and with quality and easily understand the patient's needs. On the timeline, the clinical professional will have different ways of viewing the information available, from viewing the current clinical situation and analysing the clinical history to understanding the clinical needs the patient may have in the future, such as vaccines, periodic tests, etc. To enrich the available data, the aim is also to give the clinical professional information on the possibility that a patient may contract a given pathology. This document presents the work carried out in this scope and the advantages this project will bring to the field in question.
Patients' clinical records can be considerably extensive, containing information ranging from clinical test results to diagnosed health issues. Taking into account the volume of data that may accumulate, visualising and analysing it becomes a complex task. Besides the slowness of reading and interpreting the data, the relevant information may be spread across distinct documents/locations and stored in unstructured formats. A possible solution to ease, and generally improve, this process is a better visualization aimed at reducing cognitive effort and speeding up the identification and extraction of the relevant information, as well as the conclusions that may follow. The objective of the project is to research and implement the proposed solution, giving medical personnel fast, high-quality means of acquiring and analysing clinical information. A direct benefit is giving the medical professional a clear view of the patient's state and helping them better understand the patient's needs. On the timeline, the clinical professional will have at their disposal a variety of ways to visualize not only the current clinical data but also the past history. Additionally, possible future needs, such as vaccinations and periodic clinical tests, are shown. To enrich the data available to healthcare professionals, the visualization will also indicate pathologies a patient might contract as a result of their medical history and lifestyle. This document details the work undertaken in this scope and sets out the resulting advantages for the healthcare field.
30

Zeman, Philip Michael. « Feasibility of Multi-Component Spatio-Temporal Modeling of Cognitively Generated EEG Data and its Potential Application to Research in Functional Anatomy and Clinical Neuropathology ». Thesis, 2009. http://hdl.handle.net/1828/5010.

Abstract:
This dissertation is a compendium of multiple research papers that, together, address two main objectives. The first objective and primary research question is to determine whether or not, through a procedure of independent component analysis (ICA)-based data mining, volume-domain validation, and source volume estimation, it is possible to construct a meaningful, objective, and informative model of brain activity from scalp-acquired EEG data. Given that a methodology to construct such a model can be created, the secondary objective and research question investigated is whether or not the sources derived from the EEG data can be used to construct a model of complex brain function associated with spatial navigation and the virtual Morris Water Task (vMWT). The assumptions about the signal and noise characteristics of scalp-acquired EEG data were discussed in the context of what is currently known about functional brain activity to identify appropriate characteristics by which to separate the activities comprising EEG data into parts. A new EEG analysis methodology was developed using both synthetic and real EEG data that encompasses novel algorithms for (1) data mining of the EEG to obtain the activities of individual areas of the brain, (2) anatomical modeling of brain sources that provides information about the 3-dimensional volumes from which each of the activities separated from the EEG originates, and (3) validation of data mining results to determine if a source activity found via the data mining step originates from a distinct modular unit inside the head or if it is an artefact. The methodology incorporating the algorithms developed was demonstrated for EEG data collected from study participants while they navigated a computer-based virtual maze environment. The brain activities of participants were meaningfully depicted via brain source volume estimation and representation of the activity relationships of multiple areas of the brain.
A case study was used to demonstrate the analysis methodology as applied to the EEG of an individual person. In a second study, a group EEG dataset was investigated, and activity relationships between areas of the brain for participants of the group study were individually depicted to show how brain activities of individuals can be compared to the group. The results presented in this dissertation support the conclusion that it is feasible to use ICA-based data mining to construct a physiological model of coordinated parts of the brain related to the vMWT from scalp-recorded EEG data. The methodology was successful in creating an objective and informative model of brain activity from EEG data. Furthermore, the evidence presented indicates that this methodology can be used to provide meaningful evaluation of the brain activities of individual persons and to make comparisons of individual persons against a group. In sum, the main contributions of this body of work are fivefold. The technical contributions are: (1) a new data mining algorithm tailored for EEG, (2) an EEG component validation algorithm that identifies noise components via their poor representation in a head model, (3) a volume estimation algorithm that estimates the region in the brain from which each source waveform found via data mining originates, and (4) a new procedure to study brain activities associated with spatial navigation. The main contribution of this work to the understanding of brain function is (5) evidence of specific functional systems within the brain that are used while persons participate in the vMWT paradigm (Livingstone and Skelton, 2007) examining spatial navigation.
Graduate
0541
0622
0623
31

Nouri, Golmaei Sara. « Improving the Performance of Clinical Prediction Tasks by using Structured and Unstructured Data combined with a Patient Network ». Thesis, 2021. http://dx.doi.org/10.7912/C2/41.

Abstract:
Indiana University-Purdue University Indianapolis (IUPUI)
With the increasing availability of Electronic Health Records (EHRs) and advances in deep learning techniques, developing deep predictive models that use EHR data to solve healthcare problems has gained momentum in recent years. The majority of clinical predictive models benefit from structured data in EHRs (e.g., lab measurements and medications). Still, learning clinical outcomes from all possible information sources is one of the main challenges when building predictive models. This work focuses mainly on two sources of information that have been underused by researchers: unstructured data (e.g., clinical notes) and a patient network. We propose a novel hybrid deep learning model, DeepNote-GNN, that integrates clinical notes information and patient network topological structure to improve 30-day hospital readmission prediction. DeepNote-GNN is a robust deep learning framework consisting of two modules: DeepNote and patient network. DeepNote extracts deep representations of clinical notes using a feature aggregation unit on top of a state-of-the-art Natural Language Processing (NLP) technique, BERT. By exploiting these deep representations, a patient network is built, and a Graph Neural Network (GNN) is used to train the network for hospital readmission predictions. Performance evaluation on the MIMIC-III dataset demonstrates that DeepNote-GNN achieves superior results compared to the state-of-the-art baselines on the 30-day hospital readmission task. We extensively analyze the DeepNote-GNN model to illustrate the effectiveness and contribution of each of its components. The model analysis shows that the patient network contributes significantly to the overall performance, and that DeepNote-GNN is robust and can consistently perform well on the 30-day readmission prediction task.
To evaluate the generalization of the DeepNote and patient network modules on new prediction tasks, we create a multimodal model and train it on structured and unstructured data from the MIMIC-III dataset to predict patient mortality and Length of Stay (LOS). Our proposed multimodal model consists of four components: DeepNote, patient network, DeepTemporal, and score aggregation. While DeepNote keeps its functionality and extracts representations of clinical notes, we build a DeepTemporal module using a fully connected layer stacked on top of a one-layer Gated Recurrent Unit (GRU) to extract the deep representations of temporal signals. Independently of DeepTemporal, we extract feature vectors of temporal signals and use them to build a patient network. Finally, the DeepNote, DeepTemporal, and patient network scores are linearly aggregated to fit the multimodal model on downstream prediction tasks. Our results are very competitive with the baseline model. The multimodal model analysis reveals that unstructured text data helps prediction more than temporal signals do. Moreover, there is no limitation in applying a patient network to structured data. In comparison to the other modules, the patient network makes a more significant contribution to prediction tasks. We believe that our efforts in this work have opened up a new study area that can be used to enhance the performance of clinical predictive models.
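The linear aggregation of module scores mentioned in this abstract can be pictured as a weighted average. The sketch below is only an illustration under stated assumptions: the function name `aggregate_scores`, the module keys, the scores, and the weights are all hypothetical, and the abstract does not say how the thesis actually fits or normalizes its weights.

```python
def aggregate_scores(scores, weights):
    """Linearly combine per-module scores using normalized weights."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same modules")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * scores[name] for name in scores) / total_weight


# Made-up per-module probabilities and weights, for illustration only.
module_scores = {"deepnote": 0.8, "deeptemporal": 0.6, "patient_network": 0.7}
module_weights = {"deepnote": 0.5, "deeptemporal": 0.2, "patient_network": 0.3}
combined = aggregate_scores(module_scores, module_weights)
```

Normalizing by the weight total keeps the combined score in the same range as the inputs, which is a design choice of this sketch rather than something stated in the source.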