Journal articles on the topic 'Machine Learning, Bioinformatics, Rare Diseases, Healthcare'


Consult the top 28 journal articles for your research on the topic 'Machine Learning, Bioinformatics, Rare Diseases, Healthcare.'


1

Hauschild, Anne-Christin, Marta Lemanczyk, Julian Matschinske, Tobias Frisch, Olga Zolotareva, Andreas Holzinger, Jan Baumbach, and Dominik Heider. "Federated Random Forests can improve local performance of predictive models for various healthcare applications." Bioinformatics 38, no. 8 (February 9, 2022): 2278–86. http://dx.doi.org/10.1093/bioinformatics/btac065.

Abstract:
Motivation: Limited data access has hindered the field of precision medicine from exploring its full potential, e.g. concerning machine learning and privacy and data protection rules. Our study evaluates the efficacy of federated Random Forests (FRF) models, focusing particularly on the heterogeneity within and between datasets. We addressed three common challenges: (i) number of parties, (ii) sizes of datasets and (iii) imbalanced phenotypes, evaluated on five biomedical datasets. Results: The FRF outperformed the average local models and performed comparably to the data-centralized models trained on the entire data. With an increasing number of models and decreasing dataset size, the performance of local models decreases drastically. The performance of the FRF, however, does not decrease significantly. When combining datasets of different sizes, the FRF vastly improve compared to the average local models. By analyzing different class imbalances, we demonstrate that the FRF remain more robust and outperform the local models. Our results support that FRF overcome boundaries of clinical research and enable collaborations across institutes without violating privacy or legal regulations. Clinicians benefit from a vast collection of unbiased data aggregated from different geographic locations, demographics and other varying factors. They can build more generalizable models to make better clinical decisions, which will have relevance, especially for patients in rural areas and rare or geographically uncommon diseases, enabling personalized treatment. In combination with secure multi-party computation, federated learning has the power to revolutionize clinical practice by increasing the accuracy and robustness of healthcare AI and thus paving the way for precision medicine. Availability and implementation: The implementation of the federated random forests can be found at https://featurecloud.ai/. Supplementary information: Supplementary data are available at Bioinformatics online.
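The idea of a federated forest described in this abstract can be illustrated with a minimal sketch. It is not the FeatureCloud implementation: it assumes that each party trains a small forest on its own rows and that only the fitted trees are pooled, and it swaps full decision trees for one-split stumps to stay short. All function names (`train_stump`, `train_local_forest`, `federate`, `predict`) are hypothetical.

```python
import random
from collections import Counter


def train_stump(rows, labels, rng):
    """Fit a one-split tree (stump) on a bootstrap sample, using one random feature."""
    n = len(rows)
    sample = [rng.randrange(n) for _ in range(n)]  # bootstrap indices
    feature = rng.randrange(len(rows[0]))          # random feature choice
    best = None
    for i in sample:
        threshold = rows[i][feature]
        left = [labels[j] for j in sample if rows[j][feature] <= threshold]
        right = [labels[j] for j in sample if rows[j][feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label


def train_local_forest(rows, labels, n_trees, seed):
    """Train a forest of stumps on one party's private data."""
    rng = random.Random(seed)
    return [train_stump(rows, labels, rng) for _ in range(n_trees)]


def federate(local_forests):
    """Pool the trees of all parties; fitted models are shared, patient rows are not."""
    return [tree for forest in local_forests for tree in forest]


def predict(forest, row):
    """Majority vote over every tree in the (local or federated) forest."""
    votes = Counter(
        left if row[feature] <= threshold else right
        for feature, threshold, left, right in forest
    )
    return votes.most_common(1)[0][0]
```

A local forest and the pooled forest share the same `predict` function, so the two can be compared on a held-out set in the way the abstract compares local and federated models.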
2

R, Pooja M. "Application of Learning Approaches in Healthcare." International Journal of Advanced Medical Sciences and Technology 1, no. 3 (June 10, 2021): 1–2. http://dx.doi.org/10.35940/ijamst.b3005.061321.

Abstract:
Learning approaches in healthcare aim at phenotyping disease based on clinical as well as physiological characteristics, since a disease is ideally defined and diagnosed by a combination of clinical symptoms and physiologic abnormalities. Medicine today has advanced into a new realm with the growth of applications of artificial intelligence and machine learning in healthcare. This is important because we will not be addressing the target population for a specific disease alone, but rather predicting the likely outcome of the related disease in an unknown population of interest with the knowledge gained. This is of particular importance for rare diseases, for which data are available in lower volumes. Further, prediction outcomes available at earlier stages are important to prepare points of care to handle disastrous outcomes resulting from the diseases.
3

M R, Pooja. "Application of Learning Approaches in Healthcare." International Journal of Advanced Medical Sciences and Technology 1, no. 3 (June 10, 2021): 1–2. http://dx.doi.org/10.54105/ijamst.b3005.061321.

4

Setty, Samarth Thonta, Marie-Pier Scott-Boyer, Tania Cuppens, and Arnaud Droit. "New Developments and Possibilities in Reanalysis and Reinterpretation of Whole Exome Sequencing Datasets for Unsolved Rare Diseases Using Machine Learning Approaches." International Journal of Molecular Sciences 23, no. 12 (June 18, 2022): 6792. http://dx.doi.org/10.3390/ijms23126792.

Abstract:
Rare diseases impact the lives of 300 million people in the world. Rapid advances in bioinformatics and genomic technologies have enabled the discovery of causes of 20–30% of rare diseases. However, most rare diseases have remained as unsolved enigmas to date. Newer tools and availability of high throughput sequencing data have enabled the reanalysis of previously undiagnosed patients. In this review, we have systematically compiled the latest developments in the discovery of the genetic causes of rare diseases using machine learning methods. Importantly, we have detailed methods available to reanalyze existing whole exome sequencing data of unsolved rare diseases. We have identified different reanalysis methodologies to solve problems associated with sequence alterations/mutations, variation re-annotation, protein stability, splice isoform malfunctions and oligogenic analysis. In addition, we give an overview of new developments in the field of rare disease research using whole genome sequencing data and other omics.
5

Yao, Junfeng, Wen Sun, Zhongquan Jian, Qingqiang Wu, and Xiaoli Wang. "Effective knowledge graph embeddings based on multidirectional semantics relations for polypharmacy side effects prediction." Bioinformatics 38, no. 8 (February 17, 2022): 2315–22. http://dx.doi.org/10.1093/bioinformatics/btac094.

Abstract:
Motivation: Polypharmacy is the combined use of drugs for the treatment of diseases. However, it often shows a high risk of side effects. Due to unnecessary interactions of combined drugs, the side effects of polypharmacy increase the risk of disease and even lead to death. Thus, obtaining abundant and comprehensive information on the side effects of polypharmacy is a vital task in the healthcare industry. Early traditional methods used machine learning techniques to predict side effects. However, they often require costly effort to extract drug features for prediction. Later, several methods based on knowledge graphs were proposed. They are reported to outperform traditional methods. However, they still show limited performance because they fail to model complex relations of side effects among drugs. Results: To resolve the above problems, we propose a novel model by further incorporating complex relations of side effects into knowledge graph embeddings. Our model can translate and transmit multidirectional semantics with fewer parameters, leading to better scalability in large-scale knowledge graphs. Experimental evaluation shows that our model outperforms state-of-the-art models in terms of the average area under the ROC and precision–recall curves. Availability and implementation: Code and data are available at: https://github.com/galaxysunwen/MSTE-master.
6

Kothari, Sonali, Shwetambari Chiwhane, Shruti Jain, and Malti Baghel. "Cancerous brain tumor detection using hybrid deep learning framework." Indonesian Journal of Electrical Engineering and Computer Science 26, no. 3 (June 1, 2022): 1651. http://dx.doi.org/10.11591/ijeecs.v26.i3.pp1651-1661.

Abstract:
Computational models based on deep learning (DL) algorithms have multiple processing layers representing data at multiple levels of abstraction. Deep learning has exploded in popularity in recent years, particularly in medical image processing, medical image analysis, and bioinformatics. As a result, deep learning has effectively modified and strengthened the means of identification, prediction, and diagnosis in several healthcare fields, including pathology, brain tumours, lung cancer, the abdomen, cardiac, and retina. In general, brain tumours are among the most common and aggressive malignant tumour diseases, with a limited life span if diagnosed at a higher grade. After identifying the tumour, brain tumour grading is a crucial step in evaluating a successful treatment strategy. This research aims to propose a method for cancerous brain tumor detection and classification using deep learning. In this paper, numerous soft computing techniques and a deep learning model are used to summarise the pathophysiology of brain cancer, imaging modalities for brain cancer, and automated computer-assisted methods for brain cancer characterization. In the context of machine learning and the deep learning model, the paper highlights the association between brain cancer and other brain disorders such as epilepsy, stroke, Alzheimer's, Parkinson's, and Wilson's disease, leukoaraiosis, and other neurological disorders.
7

Prakash, PKS, Srinivas Chilukuri, Nikhil Ranade, and Shankar Viswanathan. "RareBERT: Transformer Architecture for Rare Disease Patient Identification using Administrative Claims." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 1 (May 18, 2021): 453–60. http://dx.doi.org/10.1609/aaai.v35i1.16122.

Abstract:
A rare disease is any disease that affects a very small percentage (1 in 1,500) of the population. It is estimated that there are nearly 7,000 rare diseases affecting 30 million patients in the U.S. alone. Most of the patients suffering from rare diseases experience multiple misdiagnoses and may never be diagnosed correctly. This is largely driven by the low prevalence of the disease that results in a lack of awareness among healthcare providers. There have been efforts from machine learning researchers to develop predictive models to help diagnose patients using healthcare datasets such as electronic health records and administrative claims. Most recently, transformer models have been applied to predict diseases (BEHRT, G-BERT and Med-BERT). However, these have been developed specifically for electronic health records (EHR) and have not been designed to address rare disease challenges such as class imbalance, partial longitudinal data capture, and noisy labels. As a result, they deliver poor performance in predicting rare diseases compared with baselines. Besides, EHR datasets are generally confined to the hospital systems using them and do not capture a wider sample of patients, thus limiting the availability of sufficient rare disease patients in the dataset. To address these challenges, we introduced an extension of the BERT model tailored for rare disease diagnosis called RareBERT, which has been trained on administrative claims datasets. RareBERT extends Med-BERT by including context embedding and temporal reference embedding. Moreover, we introduced a novel adaptive loss function to handle the class imbalance. In this paper, we show our experiments on diagnosing X-Linked Hypophosphatemia (XLH), a genetic rare disease. While RareBERT performs significantly better than the baseline models (79.9% AUPRC versus 30% AUPRC for Med-BERT), owing to the transformer architecture, it also shows its robustness in partial longitudinal data capture caused by poor capture of claims with a drop in performance of only 1.35% AUPRC, compared with 12% for Med-BERT, 33.0% for LSTM and 67.4% for the boosting-trees-based baseline.
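The results above are reported as AUPRC, a metric that suits heavily imbalanced labels such as rare disease diagnoses. The sketch below shows one common way to estimate it, average precision over a ranked list. Whether the authors computed AUPRC this way is an assumption, and the function name `average_precision` is hypothetical.

```python
def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive example.

    labels: iterable of 0/1 ground-truth values (1 = patient has the disease).
    scores: model scores, higher meaning more likely positive.
    Tied scores are ranked in input order, which is a simplification.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(label for _, label in ranked)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positive examples")
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` predictions
    return total / n_pos
```

Unlike ROC AUC, the expected value of this metric for a scorer with no skill is close to the prevalence of the positive class, so the same AUPRC value means more when the disease is rarer.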
8

Ahmad, Iftikhar, Muhammad Javed Iqbal, and Mohammad Basheri. "Biological Data Classification and Analysis Using Convolutional Neural Network." Journal of Medical Imaging and Health Informatics 10, no. 10 (October 1, 2020): 2459–65. http://dx.doi.org/10.1166/jmihi.2020.3179.

Abstract:
The size of data gathered from various ongoing biological and clinical studies is increasing at an exponential rate. The bio-inspired data mainly comprises genes of DNA, proteins and a variety of proteomics and genetic disease data. Additionally, DNA microarray data is also available for early diagnosis and prediction of various types of cancer. Interestingly, this data may store vital information about genes, their structure and important biological function. The huge volume and constant increase in the extracted bio data has opened several challenges. Many bioinformatics and machine learning models have been developed, but those fail to address key challenges present in the efficient and accurate analysis of a variety of complex biologically inspired data such as genetic diseases. The reliable and robust process of classifying the extracted data into different classes based on the information hidden in the sample data is also a very interesting and open problem. This research work mainly focuses on overcoming major challenges in accurate protein classification, building on the success of deep learning models in natural language processing by treating protein sequences as a language. The learning ability and overall classification performance of the proposed system can be validated with deep learning classification models. The proposed system can classify the mentioned datasets more accurately than previous approaches and shows better results. The in-depth analysis of multifaceted biological data may also help in the early diagnosis of diseases that are caused by gene mutations and in overcoming arising challenges in the development of large-scale healthcare systems.
10

Cesario, Alfredo, Marika D’Oria, Riccardo Calvani, Anna Picca, Antonella Pietragalla, Domenica Lorusso, Gennaro Daniele, et al. "The Role of Artificial Intelligence in Managing Multimorbidity and Cancer." Journal of Personalized Medicine 11, no. 4 (April 19, 2021): 314. http://dx.doi.org/10.3390/jpm11040314.

Abstract:
Traditional healthcare paradigms rely on the disease-centered approach aiming at reducing human nature by discovering specific drivers and biomarkers that cause the advent and progression of diseases. This reductive approach is not always suitable to understand and manage complex conditions, such as multimorbidity and cancer. Multimorbidity requires considering heterogeneous data to tailor preventing and targeting interventions. Personalized Medicine represents an innovative approach to address the care needs of multimorbid patients considering relevant patient characteristics, such as lifestyle and individual preferences, in opposition to the more traditional “one-size-fits-all” strategy focused on interventions designed at the population level. Integration of omic (e.g., genomics) and non-strictly medical (e.g., lifestyle, the exposome) data is necessary to understand patients’ complexity. Artificial Intelligence can help integrate and manage heterogeneous data through advanced machine learning and bioinformatics algorithms to define the best treatment for each patient with multimorbidity and cancer. The experience of an Italian research hospital, leader in the field of oncology, may help to understand the multifaceted issue of managing multimorbidity and cancer in the framework of Personalized Medicine.
11

Yaqoob, Abrar, Rabia Musheer Aziz, Navneet Kumar Verma, Praveen Lalwani, Akshara Makrariya, and Pavan Kumar. "A Review on Nature-Inspired Algorithms for Cancer Disease Prediction and Classification." Mathematics 11, no. 5 (February 21, 2023): 1081. http://dx.doi.org/10.3390/math11051081.

Abstract:
In the era of healthcare and its related research fields, the dimensionality problem of high-dimensional data is a massive challenge as it is crucial to identify significant genes while conducting research on diseases like cancer. As a result, studying new Machine Learning (ML) techniques for raw gene expression biomedical data is an important field of research. Disease detection, sample classification, and early disease prediction are all important analyses of high-dimensional biomedical data in the field of bioinformatics. Recently, machine-learning techniques have dramatically improved the analysis of high-dimensional biomedical data sets. Nonetheless, researchers’ studies on biomedical data faced the challenge of vast dimensions, i.e., a very large number of features (genes) with a very small number of samples. In this paper, two dimensionality reduction methods, feature selection and feature extraction, are introduced with a systematic comparison of several dimension reduction techniques for the analysis of high-dimensional gene expression biomedical data. We presented a systematic review of some of the most popular nature-inspired algorithms and analyzed them. The paper is mainly focused on the original principles behind each of the algorithms and their applications for cancer classification and prediction from gene expression data. Lastly, the advantages and disadvantages of nature-inspired algorithms for biomedical data are evaluated. This review paper may guide researchers to choose the most effective algorithm for cancer classification and prediction for the satisfactory analysis of high-dimensional biomedical data.
12

Battineni, Gopi, Mohmmad Amran Hossain, Nalini Chintalapudi, and Francesco Amenta. "A Survey on the Role of Artificial Intelligence in Biobanking Studies: A Systematic Review." Diagnostics 12, no. 5 (May 9, 2022): 1179. http://dx.doi.org/10.3390/diagnostics12051179.

Abstract:
Introduction: In biobanks, participants’ biological samples are stored for future research. The application of artificial intelligence (AI) involves the analysis of data and the prediction of any pathological outcomes. In AI, models are used to diagnose diseases as well as classify and predict disease risks. Our research systematically analyzed AI’s role in the development of biobanks in the healthcare industry. Methods: The literature search was conducted using three digital reference databases, namely PubMed, CINAHL, and WoS. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines were followed in conducting the systematic review. The search terms included “biobanks”, “AI”, “machine learning”, and “deep learning”, as well as combinations such as “biobanks with AI”, “deep learning in the biobanking field”, and “recent advances in biobanking”. Only English-language papers were included in the study, and to assess the quality of selected works, the Newcastle–Ottawa scale (NOS) was used. Only works in the good quality range (NOS ≥ 7) were considered for further review. Results: A literature analysis of the above entries resulted in 239 studies. Based on their relevance to the study’s goal, research characteristics, and NOS criteria, we included 18 articles for reviewing. In the last decade, biobanks and artificial intelligence have had a relatively large impact on the medical system. Interestingly, UK biobanks account for the highest percentage of high-quality works, followed by Qatar, South Korea, Singapore, Japan, and Denmark. Conclusions: Translational bioinformatics probably represents a future leader in precision medicine. AI and machine learning applications to biobanking research may contribute to the development of biobanks for the utility of health services and citizens.
13

Revel-Vilk, Shoshana, Gabriel Chodick, Varda Shalev, and Noga Gadir. "Study Design: Development of an Advanced Machine Learning Algorithm for the Early Diagnosis of Gaucher Disease Using Real-World Data." Blood 136, Supplement 1 (November 5, 2020): 13–14. http://dx.doi.org/10.1182/blood-2020-134414.

Abstract:
Background: Gaucher disease (GD) is a rare, autosomal recessive condition, characterized by deficiency of the lysosomal enzyme β-glucocerebrosidase. The main disease features are anemia, thrombocytopenia, hepato-splenomegaly and bone infarction, osteonecrosis, and pathological fractures. However, diagnosis of GD can be challenging, especially for non-specialists, owing to wide variability in age at presentation, non-specific features, severity and type of clinical manifestations, and lack of awareness of the early signs and symptoms of the disease. Delayed diagnosis and misdiagnosis of GD may lead to irreversible bone disease, severe growth retardation, and high risk of bleeding; in rare cases, misdiagnosis may be life-threatening. Developing a system for early and accurate diagnosis of GD is thus an essential unmet need. The development of an algorithm for early diagnosis of patients with rare diseases such as GD may help reduce delays in diagnosis and enable prompt, appropriate initiation of therapy, earlier decision-making, prevent potentially irreversible morbidities and unnecessary tests (some invasive), reduce anxiety, and facilitate genetic counseling. This study aims to develop a predictive model for the accurate diagnosis of GD using machine learning based on real-world clinical data. Methods: This study will comprise three parts. Part 1, a retrospective observational database analysis, will use data from the electronic patient database of the Maccabi Healthcare Service (MHS), the second largest Health Maintenance Organization in Israel. The MHS includes 2.2 million health records from 25% of the Israeli population. Clinical records have been fully computerized for >20 years and are fully integrated with automated central laboratory, digitized imaging and pharmacy purchase data. 
Patients with confirmed GD who have been enrolled in the MHS health plan for ≥1 year will be eligible for inclusion, with approximately 250 patients with GD expected to be enrolled. Using MHS data from patients with GD, the Gaucher Earlier Diagnosis Consensus (GED-C) scoring system, developed by a consensus panel using Delphi methodology on the signs and co-variables that may be important for the diagnosis of GD, will be evaluated and compared with alternative scores developed directly from clinical data based on supervised machine learning. In Part 2, a clinical study, the best performing modeled scores from Part 1 will be applied to the MHS database to identify individuals who may have undiagnosed GD ('GD suspects'). Samples for diagnostic testing (using a specific and sensitive biomarker (glucosylsphingosine, lyso-Gb1) followed by beta-glucocerebrosidase (GBA) genotyping for positive samples) will be collected from MHS biobank (for individuals who have consented). Individuals not participating in the biobank will be asked to provide a sample. This part of the study will evaluate the predictive value of the modeled scores, and assess the sensitivity and specificity of the model for the diagnosis of new patients with GD. In Part 3, analysis of data from newly diagnosed patients identified in Part 2 will be used to develop machine learning models for the diagnosis of GD (Figure 1). Signs and co-variables included in the GED-C score will be used, eliminating features that are non-informative. Features will be quantitative where possible, and interaction terms will be added for age of onset and trend for key features. A number of methods will be developed, with the best performing, based on its precision at a given sensitivity level, being selected as the final model. External validation of the best identified model is planned, to ensure unbiased estimate of the model's accuracy. 
Discussion: The main goal of the study is to develop an algorithm to help detect patients with GD, independent of physicians' ability to recognize signs and symptoms, using the application of machine learning to data from a large health database. The study is expected to result in a practical tool that will alert physicians to the possibility of GD. The resulting model will also improve our understanding of GD based on the relative importance of features for GD prediction. Such tools will have a positive impact on patient care and quality of life and on healthcare costs and may lead to a change in approach for diagnosing rare diseases. Disclosures: Revel-Vilk: Takeda: Honoraria; Sanofi-Genzyme: Honoraria; Pfizer: Honoraria. Chodick: Novartis Pharma AG: Other: Institutional grant. Gadir: Takeda: Current Employment.
14

Talwar, Vineet, Kundan Singh Chufal, and Srujana Joga. "Artificial Intelligence: A New Tool in Oncologist's Armamentarium." Indian Journal of Medical and Paediatric Oncology 42, no. 06 (December 2021): 511–17. http://dx.doi.org/10.1055/s-0041-1735577.

Abstract:
Artificial intelligence (AI) has become an essential tool in human life because of its pivotal role in communications, transportation, media, and social networking. Inspired by the complex neuronal network and its functions in human beings, AI, using computer-based algorithms and training, has been explored since the 1950s. To tackle the enormous amount of patients' clinical data, imaging, histopathological data, and the increasing pace of research on new treatments and clinical trials, and ever-changing guidelines for treatment with the advent of novel drugs and evidence, AI is the need of the hour. There are numerous publications and active work on AI's role in the field of oncology. In this review, we discuss the fundamental terminology of AI, its applications in oncology on the whole, and its limitations. There is an inter-relationship between AI, machine learning, and deep learning. The virtual branch of AI deals with machine learning, while the physical branch of AI deals with the delivery of different forms of treatment: surgery, targeted drug delivery, and elderly care. The applications of AI in oncology include cancer screening, diagnosis (clinical, imaging, and histopathological), radiation therapy (image acquisition, tumor and organs at risk segmentation, image registration, planning, and delivery), prediction of treatment outcomes and toxicities, prediction of cancer cell sensitivity to therapeutics and clinical decision-making. A specific area of interest is in the development of effective drug combinations tailored to every patient and tumor with the help of AI. Radiomics, the new kid on the block, deals with the planning and administration of radiotherapy. As with any new invention, AI has its fallacies. 
The limitations include lack of external validation and proof of generalizability, difficulty in data access for rare diseases, ethical and legal issues, no precise logic behind the prediction, and last but not the least, lack of education and expertise among medical professionals. A collaboration between departments of clinical oncology, bioinformatics, and data sciences can help overcome these problems in the near future.
15

Kujawski, Stephanie, Boshu Ru, Amar K. Das, Nelson L. Afanador, Richard Baumgartner, Zhiwen Liu, Shuang Lu, et al. "1344. Predicting Measles Outbreaks in the United States: Application of Different Modeling Approaches." Open Forum Infectious Diseases 8, Supplement_1 (November 1, 2021): S759. http://dx.doi.org/10.1093/ofid/ofab466.1536.

Abstract:
Abstract Background Although measles is still rare in the United States (U.S.), there have been recent resurgent outbreaks in the U.S. To improve the accuracy of prediction given the rarity of measles events, we used machine learning (ML) algorithms to model measles case predictions at the U.S. county level. Methods The main outcome was occurrence of ≥1 measles case at the U.S. county level. Two ML prediction models were developed (HDBSCAN, a clustering algorithm, and XGBoost, a gradient boosting algorithm) and compared with traditional logistic regression. We included 28 predictors in the following categories: sociodemographics, population statistics, measles vaccination coverage, healthcare access, and exposure to measles via international air travel. The models were trained on 2014 case data and validated on 2018 case data. Models were compared using area under the receiver operating curve (AUC), sensitivity, specificity, positive predictive value (PPV), and F2 score (combined measure of sensitivity and PPV). Results There were 667 measles cases in 2014 and 375 in 2018 in the U.S. We identified U.S. counties for 635 (95.2%) cases in 2014 and 366 (97.6%) cases in 2018 through published sources, corresponding to 81/3143 (2.6%) counties in 2014 and 64/3143 (2.0%) counties in 2018 with ≥1 measles case. HDBSCAN had the highest sensitivity (0.92), but lowest AUC (0.68) and PPV (0.04) (Table). XGBoost had the highest F2 score (0.49), best balance of sensitivity (0.72) and specificity (0.94), and AUC = 0.92. Logistic regression had high AUC (0.91) and specificity (1.00) but the lowest sensitivity (0.16). Conclusion Machine learning approaches outperformed logistic regression by maximizing sensitivity to predict counties with measles cases, an important criterion to consider to prevent or prepare for future outbreaks. XGBoost or logistic regression could be considered to maximize specificity. 
Prioritizing sensitivity versus specificity may depend on county resources, priorities, and measles risk. Different modeling approaches could be considered to optimize surveillance efforts and develop effective interventions for timely response. Disclosures: Stephanie Kujawski, PhD MPH, Merck & Co., Inc. (Employee, Shareholder); Boshu Ru, Ph.D., Merck & Co. Kenilworth, NJ (NYSE: MRK) (Employee, Shareholder); Amar K. Das, MD, PhD, Merck (Employee); Richard Baumgartner, PhD, Merck (Employee); Shuang Lu, MBA, MS, Merck (Employee); Matthew Pillsbury, PhD, Merck & Co. (Employee, Shareholder); Joseph Lewnard, PhD, Merck (Consultant, Grant/Research Support); James H. Conway, MD, FAAP, GSK (Advisor or Review Panel member), Merck (Advisor or Review Panel member), Moderna (Advisor or Review Panel member), Pfizer (Advisor or Review Panel member), Sanofi Pasteur (Research Grant or Support); Manjiri D. Pawaskar, PhD, Merck & Co., Inc. (Employee, Shareholder)
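The abstract describes the F2 score as a combined measure of sensitivity and PPV. A minimal sketch of that relationship follows, assuming the standard F-beta definition with beta = 2 (the abstract does not spell out the formula). The function name `f_beta` is illustrative, and the value computed from the rounded HDBSCAN figures is a derived illustration, not a number reported by the authors.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta score: weighted harmonic mean of precision (PPV) and recall (sensitivity).

    With beta = 2, recall is weighted more heavily than precision.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded sensitivity and PPV reported for HDBSCAN in the abstract.
hdbscan_f2 = f_beta(precision=0.04, recall=0.92)
print(f"HDBSCAN F2 from rounded inputs: {hdbscan_f2:.2f}")
```

Because F2 still depends on PPV, a model with very high sensitivity but a PPV of 0.04 scores low, which is consistent with XGBoost rather than HDBSCAN having the highest F2 score.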
16

Akushevich, Igor, Carl V. Hill, and Heather E. Whitson. "LEVERAGING ANALYTIC METHODS TO EXPAND OPPORTUNITIES IN AGING-RELATED HEALTH DISPARITIES RESEARCH." Innovation in Aging 3, Supplement_1 (November 2019): S426. http://dx.doi.org/10.1093/geroni/igz038.1592.

Abstract:
Abstract The objective of the Symposium is to improve the understanding of how existing analytic methods and data can be leveraged to make progress in understanding the causes and mechanisms of health-related disparities in Alzheimer’s disease, related dementias and other prominent age-related diseases. Topics will cover a range of academic and administrative topics including: i) advanced analytic methods and modeling of health disparities with application to racial and geographic disparities in AD/ADRD; ii) the role of repeated anesthetic and surgical exposure in generation of disparities in AD/ADRD risk; iii) the nature of health disparities in cognitive aging as parallel to or distinct from health disparities in patterns of aging in other systems in the body; iv) recent advances in machine learning applied to large claims databases involving medical disparities; and v) geographic-related disparities in life expectancy across the U.S. The focus will be on demonstrating how studies using established administrative data resources such as Medicare claims databases, combined with innovative analytic approaches such as partitioning analyses, time-series based methods of projection and forecasting, and stochastic process models, can be used to uncover previously overlooked or understudied aspects in this area of research. Analyses of such increasingly available large health datasets provide an opportunity to obtain nationally representative multiethnic results based on individual-level measures that reflect the real care-related and epidemiological processes ongoing in the U.S. healthcare system, and allow the targeting of relatively rare diseases in relatively small population subgroups.
17

Dutt, Yogesh, Ruby Dhiman, Tanya Singh, Arpana Vibhuti, Archana Gupta, Ramendra Pati Pandey, V. Samuel Raj, Chung-Ming Chang, and Anjali Priyadarshini. "The Association between Biofilm Formation and Antimicrobial Resistance with Possible Ingenious Bio-Remedial Approaches." Antibiotics 11, no. 7 (July 11, 2022): 930. http://dx.doi.org/10.3390/antibiotics11070930.

Abstract:
Biofilm has garnered a lot of interest due to concerns in various sectors such as public health, medicine, and the pharmaceutical industry. Biofilm-producing bacteria show a remarkable drug resistance capability, leading to an increase in morbidity and mortality. This results in enormous economic pressure on the healthcare sector. The development of biofilms is a complex phenomenon governed by multiple factors. Several attempts have been made to unravel the events of biofilm formation, and such efforts have provided insights into the mechanisms to target for therapy. Because the biofilm state makes bacterial pathogens significantly resistant to antibiotics, targeting pathogens within biofilm is indeed a lucrative prospect. The available drugs can be repurposed to eradicate the pathogen and, as a result, ease the antimicrobial treatment burden. Biofilm formers and their infections have also been found in plants, livestock, and humans. The advent of novel strategies such as bioinformatics tools in treating, as well as preventing, biofilm formation has gained a great deal of attention. Development of novel anti-biofilm agents, such as silver nanoparticles, may be accomplished through omics approaches such as transcriptomics, metabolomics, and proteomics. Nanoparticles’ anti-biofilm properties could help to reduce antimicrobial resistance (AMR). This approach may also be integrated for a better understanding of biofilm biology, guided by mechanistic understanding, virtual screening, and machine learning in silico techniques for discovering small molecules that inhibit key biofilm regulators. This research is a rapidly growing field seeking applicable control measures to prevent biofilm formation. Therefore, the current article discusses the current understanding of biofilm formation, antibiotic resistance mechanisms in bacterial biofilm, and the novel therapeutic strategies to combat biofilm-mediated infections.
18

Maurits, M., T. Huizinga, M. Reinders, S. Raychaudhuri, E. Karlson, E. Van den Akker, and R. Knevel. "FRI0585 HIGH-THROUGHPUT METHODOLOGY FOR EMR-BASED IDENTIFICATION OF CLINICAL SUB-PHENOTYPES IN COMPLEX PATIENT POPULATIONS." Annals of the Rheumatic Diseases 79, Suppl 1 (June 2020): 897.2–897. http://dx.doi.org/10.1136/annrheumdis-2020-eular.3489.

Abstract:
Background: Heterogeneity in disease populations complicates discovery of risk factors. To identify risk factors for subpopulations of diseases, we need analytical methods that can deal with unidentified disease subgroups.
Objectives: Inspired by successful approaches from the Big Data field, we developed a high-throughput approach to identify subpopulations within patients with heterogeneous, complex diseases using the wealth of information available in Electronic Medical Records (EMRs).
Methods: We extracted longitudinal healthcare-interaction records coded by 1,853 PheCodes [1] of the 64,819 patients from Boston’s Partners Biobank. Through dimensionality reduction using t-SNE [2] we created a 2D embedding of 32,424 of these patients (set A). We then identified distinct clusters post-t-SNE using DBSCAN [3] and visualized the relative importance of individual PheCodes within them using specialized spectrographs. We replicated this procedure in the remaining 32,395 records (set B).
Results: Summary statistics of both sets were comparable (Table 1).

Table 1. Summary statistics of the total Partners Biobank dataset and the 2 partitions.
                     Set A         Set B         Total
Entries              12,200,311    12,177,131    24,377,442
Patients             32,424        32,395        64,819
Patient-years        369,546.33    368,597.92    738,144.2
Unique ICD codes     25,056        24,953        26,305
Unique PheCodes      1,851         1,853         1,853

We found 284 clusters in set A and 295 in set B, of which 63.4% from set A could be mapped to a cluster in set B with a median (range) correlation of 0.24 (0.03 – 0.58). Clusters represented similar yet distinct clinical phenotypes; e.g. patients diagnosed with “other headache syndrome” were separated into four distinct clusters characterized by migraines, neurofibromatosis, epilepsy or brain cancer, all resulting in patients presenting with headaches (Fig. 1 & 2). Though EMR databases tend to be noisy, our method was also able to differentiate misclassification from true cases; SLE patients with RA codes clustered separately from true RA cases.
Figure 1. Two-dimensional representation of Set A generated using dimensionality reduction (t-SNE) and clustering (DBSCAN).
Figure 2. Phenotype Spectrographs (PheSpecs) of four clusters characterized by “Other headache syndromes”, driven by codes relating to migraine, epilepsy, neurofibromatosis or brain cancer.
Conclusion: We have shown that EMR data can be used to identify and visualize latent structure in patient categorizations, using an approach based on dimension reduction and clustering machine learning techniques. Our method can identify misclassified patients as well as separate patients with similar problems into subsets with different associated medical problems. Our approach adds a new and powerful tool to aid in the discovery of novel risk factors in complex, heterogeneous diseases.
References: [1] Denny, J.C. et al. Bioinformatics (2010). [2] van der Maaten et al. Journal of Machine Learning Research (2008). [3] Ester, M. et al. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (1996).
Disclosure of Interests: Marc Maurits: None declared; Thomas Huizinga: Grant/research support from: Ablynx, Bristol-Myers Squibb, Roche, Sanofi, Consultant of: Ablynx, Bristol-Myers Squibb, Roche, Sanofi; Marcel Reinders: None declared; Soumya Raychaudhuri: None declared; Elizabeth Karlson: None declared; Erik van den Akker: None declared; Rachel Knevel: None declared
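The two-step procedure in the Methods (a 2D t-SNE embedding followed by DBSCAN clustering on the embedding) can be sketched with scikit-learn. This is an illustration only: the synthetic patient-by-code matrix, the group structure, and every parameter value (`perplexity`, `eps`, `min_samples`) are assumptions chosen for a small toy example, not settings reported by the authors.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Hypothetical stand-in for a patients x PheCodes matrix: 3 latent groups,
# each with its own block of 10 codes that occur more often than the rest.
n_per_group, n_codes = 50, 30
blocks = []
for g in range(3):
    prob = np.full(n_codes, 0.05)
    prob[g * 10:(g + 1) * 10] = 0.6
    blocks.append((rng.random((n_per_group, n_codes)) < prob).astype(float))
X = np.vstack(blocks)

# Step 1: reduce the code matrix to a 2D embedding with t-SNE.
embedding = TSNE(
    n_components=2, perplexity=20, init="pca", random_state=0
).fit_transform(X)

# Step 2: density-based clustering on the embedding; label -1 marks noise.
labels = DBSCAN(eps=3.0, min_samples=5).fit_predict(embedding)

n_clusters = len(set(labels) - {-1})
print(f"{n_clusters} clusters, {int(np.sum(labels == -1))} noise points")
```

DBSCAN does not need the number of clusters in advance, which suits a setting where the disease subgroups are unknown. The cluster count it returns is sensitive to `eps` relative to the scale of the t-SNE coordinates, so the value above would need tuning on real data.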
19

Shang, Aijing, Imi Faghmous, Dan Drozd, and Pablo Katz. "COMMODORE Cohort: A Novel, Real-World, Noninterventional Cohort Study Using a Patient-Centered Approach to Evaluate the Safety and Effectiveness of C5 Inhibitors in Patients with Paroxysmal Nocturnal Hemoglobinuria." Blood 136, Supplement 1 (November 5, 2020): 31–32. http://dx.doi.org/10.1182/blood-2020-137454.

Abstract:
Background Paroxysmal nocturnal hemoglobinuria (PNH) is a life-threatening disease of dysregulated complement activation characterized by hemolysis and thrombosis and is associated with bone marrow failure. PNH can also impair patient quality of life and negatively impact the ability to work. PNH is a rare disease with an estimated incidence of 1 to 1.5 cases per million people globally. Inhibition of complement component 5 (C5) has been shown to reduce intravascular hemolysis, stabilize hemoglobin, reduce the need for blood transfusion, and improve quality of life for patients with PNH. C5 inhibitors eculizumab and ravulizumab are approved in the United States and other countries for treatment of PNH, yet there are limited data on the real-world use of these agents, especially in populations not eligible to participate in registrational clinical trials. Here we describe the COMMODORE Cohort study, which will use a novel patient-centered study design to collect both retrospective and prospective patient data on the real-world use, safety, and effectiveness of eculizumab and ravulizumab, as well as disease burden and outcomes, in patients with PNH in the United States. Study Design and Methods This noninterventional cohort study will collect data using the PicnicHealth digital personal health record platform. This platform uses a novel human-in-the-loop machine learning system to integrate, harmonize, and structure patient data, including clinical notes, medications, laboratory results, and diagnostic reports contained in medical records collected from any healthcare facility in the United States. This study was designed in collaboration with patient advocacy groups to ensure that data generated will answer questions important to the patient community. 
In contrast to many studies, patients will be directly recruited to participate through multiple avenues, including working with patient advocacy groups and societies as well as outreach through social media and other communication tools. All patients must complete an informed consent form to participate. Patient data is anonymized, and the study complies with the Health Insurance Portability and Accountability Act data security standards. The study will be submitted to IntegReview for institutional review board approval. Patients who report a diagnosis of PNH within the past 5 years and have subsequently been treated with eculizumab or ravulizumab can be included in this study (Figure). Data extracted by the platform will confirm that the patient meets study criteria. The study has 3 arms: arm A comprises patients who initiated therapy with eculizumab, arm B comprises patients who initiated therapy with ravulizumab, and arm C comprises patients who initiated therapy with eculizumab and later switched to ravulizumab. Exclusion criteria for patients in arms A and B include treatment with a complement inhibitor prior to PNH diagnosis and treatment with eculizumab or ravulizumab for > 25 weeks. Exclusion criteria for all patients include platelet count < 30,000/μL, absolute neutrophil count < 500/μL, and history of bone marrow transplant. The primary objective is to describe the proportion of patients who do not receive packed red blood cell transfusion. Secondary objectives are to determine the proportions of patients with breakthrough hemolysis, stabilized hemoglobin, and a thromboembolic event and the change in normalized lactate dehydrogenase. Safety objectives are to determine the rates and proportions of selected adverse events. Primary, secondary, and safety objectives will be evaluated from week 5 to 25 of treatment. 
The effectiveness and safety analyses will be conducted in all patients who fulfill entry criteria and have a minimum of 25 weeks of accrued person-time from treatment initiation. Exploratory analyses assessing long-term experiences and outcomes will be conducted in all patients who fulfill entry criteria with no minimum treatment duration requirement. Descriptive statistics will be provided. Summary The COMMODORE Cohort study will use a novel, patient-centered approach to data generation including collaborating with patient groups to ensure that the study answers questions important to the PNH community. This approach may serve as a model for future studies evaluating other rare diseases with limited real-world data. Disclosures Shang: F. Hoffmann-La Roche Ltd: Current Employment, Current equity holder in publicly-traded company, Other: All authors received support for third party writing assistance, furnished by Scott Battle, PhD, provided by F. Hoffmann-La Roche, Basel, Switzerland. Faghmous: Kite Pharma: Current Employment; F. Hoffmann-La Roche Ltd: Ended employment in the past 24 months, Other: All authors received medical writing support for this abstract, furnished by Scott Battle, funded by F. Hoffmann-La Roche Ltd, Basel, Switzerland. Drozd: PicnicHealth: Current Employment, Current equity holder in private company; F. Hoffmann-La Roche Ltd: Other: All authors received support for third party writing assistance, furnished by Scott Battle, PhD, provided by F. Hoffmann-La Roche, Basel, Switzerland. Katz: F. Hoffmann-La Roche Ltd: Current Employment, Other: All authors received support for third party writing assistance, furnished by Scott Battle, PhD, provided by F. Hoffmann-La Roche, Basel, Switzerland.
20

Pressl, Christina, Caroline Jiang, Joel Correa da Rosa, Maximilian Friedrich, Winrich Freiwald, and Jonathan Tobin. "2093." Journal of Clinical and Translational Science 1, S1 (September 2017): 23. http://dx.doi.org/10.1017/cts.2017.93.

Abstract:
OBJECTIVES/SPECIFIC AIMS: We aim to examine the epidemiological characteristics of prosopagnosia by querying and analyzing a large deidentified clinical data set from 12 New York City-based hospitals and Federally Qualified Health Centers (FQHCs). The PCORI-funded New York City Clinical Data Research Network (NYC-CDRN) contains ~4.5 million deidentified ICD-coded electronic health records (EHRs) with comprehensive longitudinal information on demographics, patient visits, clinical conditions/diagnoses, laboratory and radiology results, medications, and clinical procedures. The NYC-CDRN will be expanded to include other data sources, including insurance claims, social determinant of health, patient reported outcomes, and patient generated data. The central hypothesis was that systematic mining of this database would reveal new epidemiological information about prosopagnosia. We developed a computable phenotype for prosopagnosia, using the International Classification of Diseases version 9 (ICD-9). The computable phenotype consisted of the diagnostic code for the condition under study, prosopagnosia (ICD-9 code 368.16), as well as the codes for known surrogate diagnoses. We expected to identify cases of acquired prosopagnosia, where the condition occurs only after brain damage, due to stroke, trauma, or meningitis for example, and cases of developmental prosopagnosia, where the condition is present from an early age, with no history of brain damage. The goals of this project were to provide new information about the condition’s prevalence rate in the New York City area, which could be furthermore translated into wider geographical areas and to yield novel details about its antecedents and comorbid conditions. METHODS/STUDY POPULATION: To determine the presence of the diagnosis of interest, prosopagnosia, and common co-occurring conditions among a New York City-based study population, we investigated a large database in collaboration with the NYC-CDRN. 
At the time the large database was mined it contained ~4 million ICD-9 coded EHRs. We first created a search paradigm applicable for screening the database, which consists of ICD-9 coded EHRs. We selected the ICD-9 code indicative of patients’ difficulties with the perception of faces (368.16), which indicates the presence of the condition as part of the psychophysical visual disturbances complex; this code identified 871 patients. Furthermore, we collected codes that indicate the presence of conditions that are known to be surrogate diagnoses of prosopagnosia. ICD-9 codes for surrogate diagnoses included, for example, 854.* (coding for personal history of traumatic brain injury, n=1,409), 434.01, 434.11, and 434.91 (coding for cerebral thrombosis, embolus and artery occlusion unspecified with cerebral infarction, n=19,409), and 191.2 (coding for malignant neoplasm of the temporal lobe, n=566). In October 2015, coding was changed to the new ICD-10 coding system. No additional patients were revealed from the data set when the cohort was searched for the presence of corresponding ICD-10 codes, as institutions are currently in transition from ICD-9 to ICD-10. Using this search query with the large database, we extracted novel information about the epidemiological and demographic distribution of prosopagnosia and, furthermore, gained new knowledge about commonly associated diseases. A limiting factor in this study is that the majority of diagnoses of prosopagnosia must be presumed to have been made on the basis of patients’ self-reports and clinicians’ judgments. We are currently exploring machine-learning strategies to identify potential false-negative cases among the patients with surrogate diagnoses. RESULTS/ANTICIPATED RESULTS: Investigations and application of our search query revealed a total number of n=129,549 patients carrying either the diagnosis code for prosopagnosia or the codes for the known surrogate diagnoses. 
There were 871 patients who carried the ICD-9 code 368.16, indicating the potential presence of prosopagnosia among other visual disturbances. The remaining patients (n=128,678) carried codes for known surrogate diagnoses contained in the search query. Statistical analyses revealed elevated odds ratios for men (OR=1.55, 95% CI: 1.36, 1.77, p<0.0001), and for Black/African American versus White individuals (OR=2.09, 95% CI: 1.74, 2.51, p<0.0001). DISCUSSION/SIGNIFICANCE OF IMPACT: Currently, the prevalence of prosopagnosia remains unknown. Face-blind individuals struggle to recognize their social contacts by face alone in everyday life and are therefore prone to experience reduced quality of life. We searched the large NYC-based clinical database, containing more than 4.5 million deidentified ICD-coded health records, for cases of prosopagnosia to shed light on its prevalence and epidemiological characteristics. We furthermore mined the database for cases carrying known surrogate diagnoses to explore the magnitude and characteristics of individuals potentially at increased risk. Our efforts address a great healthcare need, as they revealed new epidemiological knowledge of a vulnerable and understudied population. The results of this project reveal new insights into the epidemiological characteristics of prosopagnosia and its surrogate diagnoses, and demonstrate the feasibility of mining large clinical databases to identify rare clinical populations. Our results suggest the need for a more targeted diagnostic assessment of face perception abilities in populations at increased risk.
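The computable phenotype described above (one target ICD-9 code plus a list of surrogate codes, one of them a wildcard) can be sketched as a simple code-matching rule. The ICD-9 codes are the examples quoted in the abstract; the patient records, the function name `classify`, and the three output categories are hypothetical, and the authors' actual query against the NYC-CDRN is not described at this level of detail.

```python
from fnmatch import fnmatchcase

# Codes quoted in the abstract; the full surrogate list is not given there.
PROSOPAGNOSIA_CODE = "368.16"
SURROGATE_PATTERNS = ["854.*", "434.01", "434.11", "434.91", "191.2"]


def classify(patient_codes: set[str]) -> str:
    """Assign a patient to a phenotype bucket from their set of ICD-9 codes."""
    if PROSOPAGNOSIA_CODE in patient_codes:
        return "prosopagnosia_code"
    if any(
        fnmatchcase(code, pattern)
        for code in patient_codes
        for pattern in SURROGATE_PATTERNS
    ):
        return "surrogate_only"
    return "no_match"


# Hypothetical deidentified records: patient id -> set of ICD-9 codes.
records = {
    "p1": {"368.16", "401.9"},
    "p2": {"854.05"},
    "p3": {"250.00"},
}
buckets = {pid: classify(codes) for pid, codes in records.items()}
print(buckets)
```

Checking the target code first keeps the two groups disjoint, which matches how the abstract reports 871 patients with code 368.16 and the remaining patients with surrogate diagnoses only.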
21

Schaefer, Julia, Moritz Lehne, Josef Schepers, Fabian Prasser, and Sylvia Thun. "The use of machine learning in rare diseases: a scoping review." Orphanet Journal of Rare Diseases 15, no. 1 (June 9, 2020). http://dx.doi.org/10.1186/s13023-020-01424-6.

Abstract:
Abstract Background Emerging machine learning technologies are beginning to transform medicine and healthcare and could also improve the diagnosis and treatment of rare diseases. Currently, there are no systematic reviews that investigate, from a general perspective, how machine learning is used in a rare disease context. This scoping review aims to address this gap and explores the use of machine learning in rare diseases, investigating, for example, in which rare diseases machine learning is applied, which types of algorithms and input data are used or which medical applications (e.g., diagnosis, prognosis or treatment) are studied. Methods Using a complex search string including generic search terms and 381 individual disease names, studies from the past 10 years (2010–2019) that applied machine learning in a rare disease context were identified on PubMed. To systematically map the research activity, eligible studies were categorized along different dimensions (e.g., rare disease group, type of algorithm, input data), and the number of studies within these categories was analyzed. Results Two hundred eleven studies from 32 countries investigating 74 different rare diseases were identified. Diseases with a higher prevalence appeared more often in the studies than diseases with a lower prevalence. Moreover, some rare disease groups were investigated more frequently than to be expected (e.g., rare neurologic diseases and rare systemic or rheumatologic diseases), others less frequently (e.g., rare inborn errors of metabolism and rare skin diseases). Ensemble methods (36.0%), support vector machines (32.2%) and artificial neural networks (31.8%) were the algorithms most commonly applied in the studies. Only a small proportion of studies evaluated their algorithms on an external data set (11.8%) or against a human expert (2.4%). As input data, images (32.2%), demographic data (27.0%) and “omics” data (26.5%) were used most frequently. 
Most studies used machine learning for diagnosis (40.8%) or prognosis (38.4%) whereas studies aiming to improve treatment were relatively scarce (4.7%). Patient numbers in the studies were small, typically ranging from 20 to 99 (35.5%). Conclusion Our review provides an overview of the use of machine learning in rare diseases. Mapping the current research activity, it can guide future work and help to facilitate the successful application of machine learning in rare diseases.
22

Labory, Justine, Gwendal Le Bideau, David Pratella, Jean-Elisée Yao, Samira Ait-El-Mkadem Saadi, Sylvie Bannwarth, Loubna El-Hami, Véronique Paquis-Fluckinger, and Silvia Bottini. "ABEILLE: a novel method for ABerrant Expression Identification empLoying machine Learning from RNA-sequencing data." Bioinformatics, September 5, 2022. http://dx.doi.org/10.1093/bioinformatics/btac603.

Abstract:
Abstract Motivation Current advances in omics technologies are paving the way for the diagnosis of rare diseases by offering a complementary assay to identify the responsible gene. The use of transcriptomic data to identify aberrant gene expression (AGE) has been shown to reveal potentially pathogenic events. However, popular approaches for AGE identification are limited by the use of statistical tests that require an arbitrary cut-off for significance assessment and several replicates, which are not always available in clinical contexts. Results Hence we developed ABEILLE (ABerrant Expression Identification empLoying machine LEarning from sequencing data), a variational autoencoder (VAE) based method for the identification of AGEs from the analysis of RNA-seq data without the need for replicates or a control group. ABEILLE combines a VAE, able to model any data without specific assumptions on their distribution, with a decision tree to classify genes as AGE or non-AGE. An anomaly score is associated with each gene in order to stratify AGEs by severity of aberration. We tested ABEILLE on semi-synthetic datasets and an experimental dataset, demonstrating the importance of the flexibility of the VAE configuration to identify potential pathogenic candidates. Availability ABEILLE source code is freely available at: https://github.com/UCA-MSI/ABEILLE. Supplementary information Supplementary data are available at Bioinformatics online.
23

Pati, Sarthak, Ujjwal Baid, Brandon Edwards, Micah Sheller, Shih-Han Wang, G. Anthony Reina, Patrick Foley, et al. "Federated learning enables big data for rare cancer boundary detection." Nature Communications 13, no. 1 (December 5, 2022). http://dx.doi.org/10.1038/s41467-022-33407-5.

Abstract:
Abstract Although machine learning (ML) has shown promise across disciplines, out-of-sample generalizability is concerning. This is currently addressed by sharing multi-site data, but such centralization is challenging/infeasible to scale due to various limitations. Federated ML (FL) provides an alternative paradigm for accurate and generalizable ML, by only sharing numerical model updates. Here we present the largest FL study to date, involving data from 71 sites across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, reporting the largest such dataset in the literature (n = 6,314). We demonstrate a 33% delineation improvement for the surgically targetable tumor, and 23% for the complete tumor extent, over a publicly trained model. We anticipate our study to: 1) enable more healthcare studies informed by large diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further analyses for glioblastoma by releasing our consensus model, and 3) demonstrate the FL effectiveness at such scale and task-complexity as a paradigm shift for multi-site collaborations, alleviating the need for data-sharing.
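The idea of "only sharing numerical model updates" can be illustrated with a toy aggregation step in which each site sends a parameter vector and its sample count, and a server combines them without seeing any patient data. The abstract does not state the aggregation rule used in the study, so the sample-size-weighted average below (in the style of federated averaging) is an assumption, as are the function name and the example numbers.

```python
def federated_average(site_updates: list[list[float]], site_sizes: list[int]) -> list[float]:
    """Combine per-site parameter vectors into one consensus vector.

    Each site's vector is weighted by its number of local samples;
    the raw data never leaves the site.
    """
    if len(site_updates) != len(site_sizes) or not site_updates:
        raise ValueError("need one sample count per site update")
    total = sum(site_sizes)
    n_params = len(site_updates[0])
    return [
        sum(update[i] * size for update, size in zip(site_updates, site_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical sites: the second holds three times as many cases.
consensus = federated_average(
    site_updates=[[1.0, 2.0], [3.0, 4.0]],
    site_sizes=[1, 3],
)
print(consensus)  # [2.5, 3.5]
```

Weighting by sample count lets large sites contribute more to the consensus while small sites, which matter for a rare disease, still take part without transferring their data.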
24

Fernandes, Felipe, Ingridy Barbalho, Daniele Barros, Ricardo Valentim, César Teixeira, Jorge Henriques, Paulo Gil, and Mário Dourado Júnior. "Biomedical signals and machine learning in amyotrophic lateral sclerosis: a systematic review." BioMedical Engineering OnLine 20, no. 1 (June 15, 2021). http://dx.doi.org/10.1186/s12938-021-00896-2.

Abstract:
Abstract Introduction The use of machine learning (ML) techniques in healthcare encompasses an emerging concept that envisages vast contributions to the tackling of rare diseases. In this scenario, amyotrophic lateral sclerosis (ALS) involves complexities that are yet not demystified. In ALS, the biomedical signals present themselves as potential biomarkers that, when used in tandem with smart algorithms, can be useful to applications within the context of the disease. Methods This Systematic Literature Review (SLR) consists of searching for and investigating primary studies that use ML techniques and biomedical signals related to ALS. Following the definition and execution of the SLR protocol, 18 articles met the inclusion, exclusion, and quality assessment criteria, and answered the SLR research questions. Discussions Based on the results, we identified three classes of ML applications combined with biomedical signals in the context of ALS: diagnosis (72.22%), communication (22.22%), and survival prediction (5.56%). Conclusions Distinct algorithmic models and biomedical signals have been reported and present promising approaches, regardless of their classes. In summary, this SLR provides an overview of the primary studies analyzed as well as directions for the construction and evolution of technology-based research within the scope of ALS.
25

Tisdale, Ainslie, Christine M. Cutillo, Ramaa Nathan, Pierantonio Russo, Bryan Laraway, Melissa Haendel, Douglas Nowak, et al. "The IDeaS initiative: pilot study to assess the impact of rare diseases on patients and healthcare systems." Orphanet Journal of Rare Diseases 16, no. 1 (October 22, 2021). http://dx.doi.org/10.1186/s13023-021-02061-3.

Abstract:
Abstract Background Rare diseases (RD) are a diverse collection of more than 7,000–10,000 different disorders, most of which affect a small number of people per disease. Because of their rarity and fragmentation of patients across thousands of different disorders, the medical needs of RD patients are not well recognized or quantified in healthcare systems (HCS). Methodology We performed a pilot IDeaS study, where we attempted to quantify the number of RD patients and the direct medical costs of 14 representative RD within 4 different HCS databases and performed a preliminary analysis of the diagnostic journey for selected RD patients. Results The overall findings were notable for: (1) RD patients are difficult to quantify in HCS using ICD coding search criteria, which likely results in under-counting and under-estimation of their true impact to HCS; (2) per patient direct medical costs of RD are high, estimated to be around three- to fivefold higher than age-matched controls; and (3) preliminary evidence shows that diagnostic journeys are likely prolonged in many patients, and may result in progressive, irreversible, and costly complications of their disease. Conclusions The results of this small pilot suggest that RD have high medical burdens to patients and HCS, and collectively represent a major impact to the public health. Machine-learning strategies applied to HCS databases and medical records using sentinel disease and patient characteristics may hold promise for faster and more accurate diagnosis for many RD patients and should be explored to help address the high unmet medical needs of RD patients.
26

Hallowell, Nina, Shirlene Badger, Aurelia Sauerbrei, Christoffer Nellåker, and Angeliki Kerasidou. "“I don’t think people are ready to trust these algorithms at face value”: trust and the use of machine learning algorithms in the diagnosis of rare disease." BMC Medical Ethics 23, no. 1 (November 16, 2022). http://dx.doi.org/10.1186/s12910-022-00842-4.

Abstract:
Abstract Background As the use of AI becomes more pervasive, and computerised systems are used in clinical decision-making, the role of trust in, and the trustworthiness of, AI tools will need to be addressed. Using the case of computational phenotyping to support the diagnosis of rare disease in dysmorphology, this paper explores under what conditions we could place trust in medical AI tools, which employ machine learning. Methods Semi-structured qualitative interviews (n = 20) with stakeholders (clinical geneticists, data scientists, bioinformaticians, industry and patient support group spokespersons) who design and/or work with computational phenotyping (CP) systems. The method of constant comparison was used to analyse the interview data. Results Interviewees emphasized the importance of establishing trust in the use of CP technology in identifying rare diseases. Trust was formulated in two interrelated ways in these data. First, interviewees talked about the importance of using CP tools within the context of a trust relationship; arguing that patients will need to trust clinicians who use AI tools and that clinicians will need to trust AI developers, if they are to adopt this technology. Second, they described a need to establish trust in the technology itself, or in the knowledge it provides—epistemic trust. Interviewees suggested CP tools used for the diagnosis of rare diseases might be perceived as more trustworthy if the user is able to vouchsafe for the technology’s reliability and accuracy and the person using/developing them is trusted. Conclusion This study suggests we need to take deliberate and meticulous steps to design reliable or confidence-worthy AI systems for use in healthcare. In addition, we need to devise reliable or confidence-worthy processes that would give rise to reliable systems; these could take the form of RCTs and/or systems of accountability, transparency and responsibility that would signify the epistemic trustworthiness of these tools.
27

Dros, Jesper T., Isabelle Bos, Frank C. Bennis, Sytske Wiegersma, John Paget, Chiara Seghieri, Jaime Barrio Cortés, and Robert A. Verheij. "Detection of primary Sjögren’s syndrome in primary care: developing a classification model with the use of routine healthcare data and machine learning." BMC Primary Care 23, no. 1 (August 9, 2022). http://dx.doi.org/10.1186/s12875-022-01804-w.

Abstract:
Abstract Background Primary Sjögren’s Syndrome (pSS) is a rare autoimmune disease that is difficult to diagnose due to a variety of clinical presentations, resulting in misdiagnosis and late referral to specialists. To improve early-stage disease recognition, this study aimed to develop an algorithm to identify possible pSS patients in primary care. We built a machine learning algorithm which was based on combined healthcare data as a first step towards a clinical decision support system. Method Routine healthcare data, consisting of primary care electronic health records (EHRs) data and hospital claims data (HCD), were linked on patient level and consisted of 1411 pSS and 929,179 non-pSS patients. Logistic regression (LR) and random forest (RF) models were used to classify patients using age, gender, diseases and symptoms, prescriptions and GP visits. Results The LR and RF models had an AUC of 0.82 and 0.84, respectively. Many actual pSS patients were found (sensitivity LR = 72.3%, RF = 70.1%), specificity was 74.0% (LR) and 77.9% (RF) and the negative predictive value was 99.9% for both models. However, most patients classified as pSS patients did not have a diagnosis of pSS in secondary care (positive predictive value LR = 0.4%, RF = 0.5%). Conclusion This is the first study to use machine learning to classify patients with pSS in primary care using GP EHR data. Our algorithm has the potential to support the early recognition of pSS in primary care and should be validated and optimized in clinical practice. To further enhance the algorithm in detecting pSS in primary care, we suggest it is improved by working with experienced clinicians.
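The very low positive predictive value reported in this abstract follows largely from how rare pSS is in the cohort. The sketch below is an illustration, not the authors' code: it applies Bayes' rule to the sensitivities, specificities and cohort sizes quoted above. It assumes that the prevalence in the evaluation data equals the prevalence in the full linked cohort, which the abstract does not state; the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) computed from Bayes' rule.

    All arguments are fractions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Cohort sizes quoted in the abstract.
n_pss, n_non_pss = 1411, 929_179
# Assumption: evaluation prevalence equals the whole-cohort prevalence.
prevalence = n_pss / (n_pss + n_non_pss)

# Sensitivity and specificity quoted in the abstract for each model.
for name, sens, spec in [("LR", 0.723, 0.740), ("RF", 0.701, 0.779)]:
    ppv, npv = predictive_values(sens, spec, prevalence)
    print(f"{name}: PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Under that assumption the computed values round to the figures in the abstract (PPV of about 0.4% and 0.5%, NPV of about 99.9%). With roughly 1.5 pSS patients per 1000, even a specificity near 78% produces far more false positives than true positives.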
28

Jamian, Lia, Lee Wheless, Leslie J. Crofford, and April Barnado. "Rule-based and machine learning algorithms identify patients with systemic sclerosis accurately in the electronic health record." Arthritis Research & Therapy 21, no. 1 (December 2019). http://dx.doi.org/10.1186/s13075-019-2092-7.

Abstract:
Abstract Background Systemic sclerosis (SSc) is a rare disease with studies limited by small sample sizes. Electronic health records (EHRs) represent a powerful tool to study patients with rare diseases such as SSc, but validated methods are needed. We developed and validated EHR-based algorithms that incorporate billing codes and clinical data to identify SSc patients in the EHR. Methods We used a de-identified EHR with over 3 million subjects and identified 1899 potential SSc subjects with at least 1 count of the SSc ICD-9 (710.1) or ICD-10-CM (M34*) codes. We randomly selected 200 as a training set for chart review. A subject was a case if diagnosed with SSc by a rheumatologist, dermatologist, or pulmonologist. We selected the following algorithm components based on clinical knowledge and available data: SSc ICD-9 and ICD-10-CM codes, positive antinuclear antibody (ANA) (titer ≥ 1:80), and a keyword of Raynaud’s phenomenon (RP). We performed both rule-based and machine learning techniques for algorithm development. Positive predictive values (PPVs), sensitivities, and F-scores (which account for PPVs and sensitivities) were calculated for the algorithms. Results PPVs were low for algorithms using only 1 count of the SSc ICD-9 code. As code counts increased, the PPVs increased. PPVs were higher for algorithms using ICD-10-CM codes versus the ICD-9 code. Adding a positive ANA and RP keyword increased the PPVs of algorithms only using ICD billing codes. Algorithms using ≥ 3 or ≥ 4 counts of the SSc ICD-9 or ICD-10-CM codes and ANA positivity had the highest PPV at 100% but a low sensitivity at 50%. The algorithm with the highest F-score of 91% was ≥ 4 counts of the ICD-9 or ICD-10-CM codes with an internally validated PPV of 90%. A machine learning method using random forests yielded an algorithm with a PPV of 84%, sensitivity of 92%, and F-score of 88%. The most important feature was RP keyword. 
Conclusions Algorithms using only ICD-9 codes did not perform well to identify SSc patients. The highest performing algorithms incorporated clinical data with billing codes. EHR-based algorithms can identify SSc patients across a healthcare system, enabling researchers to examine important outcomes.
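As a rough illustration of the rule-based approach described in this abstract, the sketch below counts SSc billing codes (ICD-9 710.1 or ICD-10-CM M34*) in a patient's record and flags the patient when the count reaches a threshold. It is not the authors' implementation. The function names, the default threshold argument and the example code lists are hypothetical, and it assumes that each list element is one recorded occurrence of a code. The F-score is assumed to be the balanced F1 (the harmonic mean of PPV and sensitivity), which the abstract does not state explicitly.

```python
import re

# SSc codes named in the abstract: ICD-9 710.1 and ICD-10-CM M34*.
SSC_CODE = re.compile(r"(710\.1|M34(\.[0-9A-Z]+)?)")


def count_ssc_codes(codes):
    """Count entries in `codes` that are SSc billing codes."""
    return sum(1 for c in codes if SSC_CODE.fullmatch(c.strip().upper()))


def is_ssc_case(codes, min_count=4):
    """Rule-based classifier: flag a patient with >= min_count SSc codes."""
    return count_ssc_codes(codes) >= min_count


def f_score(ppv, sensitivity):
    """Balanced F-score (F1): harmonic mean of PPV and sensitivity."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Hypothetical records, for illustration only.
patient_a = ["710.1", "M34.0", "M34.9", "M34.81"]  # four SSc codes
patient_b = ["710.1", "M34.0", "I73.00"]           # two SSc codes
print(is_ssc_case(patient_a), is_ssc_case(patient_b))

# Random forest figures quoted in the abstract: PPV 84%, sensitivity 92%.
print(f"{f_score(0.84, 0.92):.0%}")
```

Under the F1 assumption, a PPV of 84% and a sensitivity of 92% give an F-score of about 88%, in line with the value reported for the random forest. The same formula shows why the algorithms with 100% PPV but 50% sensitivity do not rank highest: their F1 would be about 67%.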