Academic literature on the topic 'Genomics Big Data Engineering'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Genomics Big Data Engineering.'

Journal articles on the topic "Genomics Big Data Engineering"

1

Lekić, Matea, Kristijan Rogić, Adrienn Boldizsár, Máté Zöldy, and Ádám Török. "Big Data in Logistics." Periodica Polytechnica Transportation Engineering 49, no. 1 (December 17, 2019): 60–65. http://dx.doi.org/10.3311/pptr.14589.

Abstract:
With certainty, we can say that we are in the process of a new big revolution that has its name, Big Data. Though the term was devised by scientists from the area such as astronomy and genomics, Big Data is everywhere. They are both a resource and a tool whose main task is to provide information. However, as far as it can help us better understand the world around us, depending on how they are managed and who controls them, they can take us in some other direction. Although the figures that bind to Big Data can seem enormous at this time, we must be aware that the amount of what we can collect and the process is always just a fraction of the information that really exists in the world (and around it). However, from something we have to start!
2

Radha, K., and B. Thirumala Rao. "A Study on Big Data Techniques and Applications." International Journal of Advances in Applied Sciences 5, no. 2 (June 1, 2016): 101. http://dx.doi.org/10.11591/ijaas.v5.i2.pp101-108.

Abstract:
We are living in an on-demand digital universe, with data spread by users and organizations at a very high rate. This data is categorized as Big Data because of its Variety, Velocity, Veracity and Volume. It is further classified into unstructured, semi-structured and structured data. Large datasets require special processing systems, which poses a unique challenge for academicians and researchers. MapReduce jobs use efficient data processing techniques that are applied in every phase of MapReduce, such as Mapping, Combining, Shuffling, Indexing, Grouping and Reducing. The essential characteristics of Big Data are Variety, Volume, Velocity, Viscosity and Virality. Big Data is one of the current and future research frontiers. Big Data is changing many areas, such as public administration, scientific research, business, the financial services industry, the automotive industry, supply chain, logistics, industrial engineering, retail and entertainment. Other Big Data applications exist in atmospheric science, astronomy, medicine, biology, biogeochemistry, genomics, and interdisciplinary and complex research. This paper presents the essential characteristics of Big Data applications and state-of-the-art tools and techniques for handling data-intensive applications, builds an index for web pages available online, and shows how Map and Reduce functions can be executed on a set of documents as input.
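The indexing idea mentioned in this abstract can be sketched in a few lines of plain Python. This is only an illustration of the general map, shuffle and reduce pattern applied to an inverted index, not the paper's implementation and not Hadoop code; the function names (`map_document`, `shuffle`, `reduce_postings`, `build_index`) and the toy corpus are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_document(doc_id, text):
    """Map phase: emit one (word, doc_id) pair per word occurrence."""
    return [(word.lower(), doc_id) for word in text.split()]


def shuffle(pairs):
    """Shuffle phase: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_postings(word, doc_ids):
    """Reduce phase: collapse a word's values into a sorted posting list."""
    return word, sorted(set(doc_ids))


def build_index(documents):
    """Run map, shuffle and reduce over a dict of {doc_id: text}."""
    mapped = chain.from_iterable(
        map_document(doc_id, text) for doc_id, text in documents.items()
    )
    grouped = shuffle(mapped)
    return dict(reduce_postings(word, ids) for word, ids in grouped.items())


# Hypothetical toy corpus, not taken from the paper.
DOCS = {
    "page1": "big data genomics",
    "page2": "genomics data pipelines",
}
INDEX = build_index(DOCS)
```

In a real cluster the map and reduce calls would run on different machines and the shuffle would move data across the network; here all three phases run in one process so the data flow is easy to follow.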
3

Gut, Philipp, Sven Reischauer, Didier Y. R. Stainier, and Rima Arnaout. "Little Fish, Big Data: Zebrafish as a Model for Cardiovascular and Metabolic Disease." Physiological Reviews 97, no. 3 (July 1, 2017): 889–938. http://dx.doi.org/10.1152/physrev.00038.2016.

Abstract:
The burden of cardiovascular and metabolic diseases worldwide is staggering. The emergence of systems approaches in biology promises new therapies, faster and cheaper diagnostics, and personalized medicine. However, a profound understanding of pathogenic mechanisms at the cellular and molecular levels remains a fundamental requirement for discovery and therapeutics. Animal models of human disease are cornerstones of drug discovery as they allow identification of novel pharmacological targets by linking gene function with pathogenesis. The zebrafish model has been used for decades to study development and pathophysiology. More than ever, the specific strengths of the zebrafish model make it a prime partner in an age of discovery transformed by big-data approaches to genomics and disease. Zebrafish share a largely conserved physiology and anatomy with mammals. They allow a wide range of genetic manipulations, including the latest genome engineering approaches. They can be bred and studied with remarkable speed, enabling a range of large-scale phenotypic screens. Finally, zebrafish demonstrate an impressive regenerative capacity scientists hope to unlock in humans. Here, we provide a comprehensive guide on applications of zebrafish to investigate cardiovascular and metabolic diseases. We delineate advantages and limitations of zebrafish models of human disease and summarize their most significant contributions to understanding disease progression to date.
4

Kennedy, Paul J., Daniel R. Catchpoole, Siamak Tafavogh, Bronwyn L. Harvey, and Ahmad A. Aloqaily. "Feature prioritisation on big genomic data for analysing gene-gene interactions." International Journal of Bioinformatics Research and Applications 17, no. 2 (2021): 158. http://dx.doi.org/10.1504/ijbra.2021.10037182.

5

Aloqaily, Ahmad A., Siamak Tafavogh, Bronwyn L. Harvey, Daniel R. Catchpoole, and Paul J. Kennedy. "Feature prioritisation on big genomic data for analysing gene-gene interactions." International Journal of Bioinformatics Research and Applications 17, no. 2 (2021): 158. http://dx.doi.org/10.1504/ijbra.2021.114420.

6

Chan, Jireh Yi-Le, Steven Mun Hong Leow, Khean Thye Bea, Wai Khuen Cheng, Seuk Wai Phoong, Zeng-Wei Hong, and Yen-Lin Chen. "Mitigating the Multicollinearity Problem and Its Machine Learning Approach: A Review." Mathematics 10, no. 8 (April 12, 2022): 1283. http://dx.doi.org/10.3390/math10081283.

Abstract:
Technologies have driven big data collection across many fields, such as genomics and business intelligence. This results in a significant increase in variables and data points (observations) collected and stored. Although this presents opportunities to better model the relationship between predictors and the response variables, this also causes serious problems during data analysis, one of which is the multicollinearity problem. The two main approaches used to mitigate multicollinearity are variable selection methods and modified estimator methods. However, variable selection methods may negate efforts to collect more data as new data may eventually be dropped from modeling, while recent studies suggest that optimization approaches via machine learning handle data with multicollinearity better than statistical estimators. Therefore, this study details the chronological developments to mitigate the effects of multicollinearity and up-to-date recommendations to better mitigate multicollinearity.
7

Yan, Hong. "Coclustering of Multidimensional Big Data: A Useful Tool for Genomic, Financial, and Other Data Analysis." IEEE Systems, Man, and Cybernetics Magazine 3, no. 2 (April 2017): 23–30. http://dx.doi.org/10.1109/msmc.2017.2664218.

8

Shandilya, Shishir K., S. Sountharrajan, Smita Shandilya, and E. Suganya. "Big Data Analytics Framework for Real-Time Genome Analysis: A Comprehensive Approach." Journal of Computational and Theoretical Nanoscience 16, no. 8 (August 1, 2019): 3419–27. http://dx.doi.org/10.1166/jctn.2019.8302.

Abstract:
Big Data Technologies are well-accepted in the recent years in bio-medical and genome informatics. They are capable to process gigantic and heterogeneous genome information with good precision and recall. With the quick advancements in computation and storage technologies, the cost of acquiring and processing the genomic data has decreased significantly. The upcoming sequencing platforms will produce vast amount of data, which will imperatively require high-performance systems for on-demand analysis with time-bound efficiency. Recent bio-informatics tools are capable of utilizing the novel features of Hadoop in a more flexible way. In particular, big data technologies such as MapReduce and Hive are able to provide high-speed computational environment for the analysis of petabyte scale datasets. This has attracted the focus of bio-scientists to use the big data applications to automate the entire genome analysis. The proposed framework is designed over MapReduce and Java on extended Hadoop platform to achieve the parallelism of Big Data Analysis. It will assist the bioinformatics community by providing a comprehensive solution for Descriptive, Comparative, Exploratory, Inferential, Predictive and Causal Analysis on Genome data. The proposed framework is user-friendly, fully-customizable, scalable and fit for comprehensive real-time genome analysis from data acquisition till predictive sequence analysis.
9

Shi, Daoyuan, and Lynn Kuo. "VARIABLE SELECTION FOR BAYESIAN SURVIVAL MODELS USING BREGMAN DIVERGENCE MEASURE." Probability in the Engineering and Informational Sciences 34, no. 3 (June 22, 2018): 364–80. http://dx.doi.org/10.1017/s0269964818000190.

Abstract:
The variable selection has been an important topic in regression and Bayesian survival analysis. In the era of rapid development of genomics and precision medicine, the topic is becoming more important and challenging. In addition to the challenges of handling censored data in survival analysis, we are facing increasing demand of handling big data with too many predictors where most of them may not be relevant to the prediction of the survival outcome. With the desire of improving upon the accuracy of prediction, we explore the Bregman divergence criterion in selecting predictive models. We develop sparse Bayesian formulation for parametric regression and semiparametric regression models and demonstrate how variable selection is done using the predictive approach. Model selections for a simulated data set, and two real-data sets (one for a kidney transplant study, and the other for a breast cancer microarray study at the Memorial Sloan-Kettering Cancer Center) are carried out to illustrate our methods.
10

Ullah, Mohammad Asad, Muhammad-Redha Abdullah-Zawawi, Rabiatul-Adawiah Zainal-Abidin, Noor Liyana Sukiran, Md Imtiaz Uddin, and Zamri Zainal. "A Review of Integrative Omic Approaches for Understanding Rice Salt Response Mechanisms." Plants 11, no. 11 (May 27, 2022): 1430. http://dx.doi.org/10.3390/plants11111430.

Abstract:
Soil salinity is one of the most serious environmental challenges, posing a growing threat to agriculture across the world. Soil salinity has a significant impact on rice growth, development, and production. Hence, improving rice varieties’ resistance to salt stress is a viable solution for meeting global food demand. Adaptation to salt stress is a multifaceted process that involves interacting physiological traits, biochemical or metabolic pathways, and molecular mechanisms. The integration of multi-omics approaches contributes to a better understanding of molecular mechanisms as well as the improvement of salt-resistant and tolerant rice varieties. Firstly, we present a thorough review of current knowledge about salt stress effects on rice and mechanisms behind rice salt tolerance and salt stress signalling. This review focuses on the use of multi-omics approaches to improve next-generation rice breeding for salinity resistance and tolerance, including genomics, transcriptomics, proteomics, metabolomics and phenomics. Integrating multi-omics data effectively is critical to gaining a more comprehensive and in-depth understanding of the molecular pathways, enzyme activity and interacting networks of genes controlling salinity tolerance in rice. The key data mining strategies within the artificial intelligence to analyse big and complex data sets that will allow more accurate prediction of outcomes and modernise traditional breeding programmes and also expedite precision rice breeding such as genetic engineering and genome editing.

Dissertations / Theses on the topic "Genomics Big Data Engineering"

1

Goldstein, Theodore C. "Tools for extracting actionable medical knowledge from genomic big data." Thesis, University of California, Santa Cruz, 2013. http://pqdtopen.proquest.com/#viewpdf?dispub=3589324.

Abstract:

Cancer is an ideal target for personal genomics-based medicine that uses high-throughput genome assays such as DNA sequencing, RNA sequencing, and expression analysis (collectively called omics); however, researchers and physicians are overwhelmed by the quantities of big data from these assays and cannot interpret this information accurately without specialized tools. To address this problem, I have created software methods and tools called OCCAM (OmiC data Cancer Analytic Model) and DIPSC (Differential Pathway Signature Correlation) for automatically extracting knowledge from this data and turning it into an actionable knowledge base called the activitome. An activitome signature measures a mutation's effect on the cellular molecular pathway. As well, activitome signatures can also be computed for clinical phenotypes. By comparing the vectors of activitome signatures of different mutations and clinical outcomes, intrinsic relationships between these events may be uncovered. OCCAM identifies activitome signatures that can be used to guide the development and application of therapies. DIPSC overcomes the confounding problem of correlating multiple activitome signatures from the same set of samples. In addition, to support the collection of this big data, I have developed MedBook, a federated distributed social network designed for a medical research and decision support system. OCCAM and DIPSC are two of the many apps that will operate inside of MedBook. MedBook extends the Galaxy system with a signature database, an end-user oriented application platform, a rich data medical knowledge-publishing model, and the Biomedical Evidence Graph (BMEG). The goal of MedBook is to improve the outcomes by learning from every patient.

2

Miller, Chase Allen. "Towards a Web-Based, Big Data, Genomics Ecosystem." Thesis, Boston College, 2014. http://hdl.handle.net/2345/bc-ir:104052.

Abstract:
Thesis advisor: Gabor T. Marth
Rapid advances in genome sequencing enable a wide range of biological experiments on a scale that was until recently restricted to large genome centers. However, the analysis of the resulting vast genomic datasets is time-consuming, unintuitive and requires considerable computational expertise and costly infrastructure. Collectively, these factors effectively exclude many bench biologists from genome-scale analyses. Web-based visualization and analysis libraries, frameworks, and applications were developed to empower all biological researchers to easily, interactively, and in a visually driven manner, analyze large biomedical datasets that are essential for their research, without bioinformatics expertise and costly hardware
Thesis (PhD) — Boston College, 2014
Submitted to: Boston College. Graduate School of Arts and Sciences
Discipline: Biology
3

Hansen, Simon, and Erik Markow. "Big Data : Implementation av Big Data i offentlig verksamhet." Thesis, Högskolan i Halmstad, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-38756.

4

Kämpe, Gabriella. "How Big Data Affects User Experience: Reducing cognitive load in big data applications." Thesis, Umeå universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-163995.

Abstract:
We have entered the age of big data. Massive data sets are common in enterprises, government, and academia. Interpreting such scales of data is still hard for the human mind. This thesis investigates how proper design can decrease the cognitive load in data-heavy applications. It focuses on numeric data describing economic growth in retail organizations. It aims to answer the questions: What is important to keep in mind when designing an interface that holds large amounts of data? and How to decrease the cognitive load in complex user interfaces without reducing functionality?. It aims to answer these questions by comparing two user interfaces in terms of efficiency, structure, ease of use and navigation. Each interface holds the same functionality and amount of data, but one is designed to increase user experience by reducing cognitive load. The design choices in the second application are based on the theory found in the literature study in the thesis.
5

Luo, Changqing. "Towards Secure Big Data Computing." Case Western Reserve University School of Graduate Studies / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=case1529929603348119.

6

Schobel, Seth Adam Micah. "The viral genomics revolution: Big data approaches to basic viral research, surveillance, and vaccine development." Thesis, University of Maryland, College Park, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10011480.

Abstract:

Since the decoding of the first RNA virus in 1976, the field of viral genomics has exploded, first through the use of Sanger sequencing technologies and later with the use of next-generation sequencing approaches. With the development of these sequencing technologies, viral genomics has entered an era of big data. New challenges for analyzing these data are now apparent. Here, we describe novel methods to extend the current capabilities of viral comparative genomics. Through the use of antigenic distancing techniques, we have examined the relationship between the antigenic phenotype and the genetic content of influenza virus to establish a more systematic approach to viral surveillance and vaccine selection. Distancing of Antigenicity by Sequence-based Hierarchical Clustering (DASH) was developed and used to perform a retrospective analysis of 22 influenza seasons. Our methods produced vaccine candidates identical to or with a high concordance of antigenic similarity with those selected by the WHO. In a second effort, we have developed VirComp and OrionPlot: two independent yet related tools. These tools first generate gene-based genome constellations, or genotypes, of viral genomes, and second create visualizations of the resultant genome constellations. VirComp utilizes sequence-clustering techniques to infer genome constellations and prepares genome constellation data matrices for visualization with OrionPlot. OrionPlot is a Java application for tailoring genome constellation figures for publication. OrionPlot allows for color selection of gene cluster assignments, customized box sizes to enable the visualization of gene comparisons based on sequence length, and label coloring. We have provided five analyses designed as vignettes to illustrate the utility of our tools for performing viral comparative genomic analyses. Study three focused on the analysis of respiratory syncytial virus (RSV) genomes circulating during the 2012-2013 RSV season.
We discovered a correlation between a recent tandem duplication within the G gene of RSV-A and a decrease in severity of infection. Our data suggests that this duplication is associated with a higher infection rate in female infants than is generally observed. Through these studies, we have extended the state of the art of genotype analysis, phenotype/genotype studies and established correlations between clinical metadata and RSV sequence data.

7

Cheelangi, Madhusudan. "Result Distribution in Big Data Systems." Thesis, University of California, Irvine, 2013. http://pqdtopen.proquest.com/#viewpdf?dispub=1539891.

Abstract:

We are building a Big Data Management System (BDMS) called AsterixDB at UCI. Since AsterixDB is designed to operate on large volumes of data, the results for its queries can be potentially very large, and AsterixDB is also designed to operate under high concurrency workloads. As a result, we need a specialized mechanism to manage these large volumes of query results and deliver them to the clients. In this thesis, we present an architecture and an implementation of a new result distribution framework that is capable of handling large volumes of results under high concurrency workloads. We present the various components of this result distribution framework and show how they interact with each other to manage large volumes of query results and deliver them to clients. We also discuss various result distribution policies that are possible with our framework and compare their performance through experiments.

We have implemented a REST-like HTTP client interface on top of the result distribution framework to allow clients to submit queries and obtain their results. This client interface provides two modes for clients to choose from to read their query results: synchronous mode and asynchronous mode. In synchronous mode, query results are delivered to a client as a direct response to its query within the same request-response cycle. In asynchronous mode, a query handle is returned instead to the client as a response to its query. The client can store the handle and send another request later, including the query handle, to read the result for the query whenever it wants. The architectural support for these two modes is also described in this thesis. We believe that the result distribution framework, combined with this client interface, successfully meets the result management demands of AsterixDB.
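The difference between the two delivery modes can be illustrated with a small in-process sketch. This is not AsterixDB's actual interface: the class `ResultService`, its methods and the stand-in `fake_engine` are hypothetical names, and a thread pool stands in for the query engine and the HTTP layer.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class ResultService:
    """Toy result service offering synchronous and asynchronous delivery."""

    def __init__(self, run_query):
        self._run_query = run_query          # hypothetical query engine callable
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._pending = {}                   # handle -> Future holding the result

    def query_sync(self, statement):
        """Synchronous mode: the result is the direct response to the query."""
        return self._run_query(statement)

    def query_async(self, statement):
        """Asynchronous mode: start the query and return only a handle."""
        handle = str(uuid.uuid4())
        self._pending[handle] = self._pool.submit(self._run_query, statement)
        return handle

    def read_result(self, handle):
        """Later request: exchange a stored handle for the query result."""
        return self._pending.pop(handle).result()

    def close(self):
        self._pool.shutdown()


def fake_engine(statement):
    """Stand-in for a real query engine; returns a single fake row."""
    return [f"row for {statement}"]


SERVICE = ResultService(fake_engine)
SYNC_ROWS = SERVICE.query_sync("q1")
HANDLE = SERVICE.query_async("q2")
ASYNC_ROWS = SERVICE.read_result(HANDLE)
SERVICE.close()
```

The asynchronous path trades an extra round trip for the freedom to fetch a large result whenever the client is ready, which is the trade-off the abstract describes.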

8

Laurila, M. (Mikko). "Big data in Finnish financial services." Bachelor's thesis, University of Oulu, 2017. http://urn.fi/URN:NBN:fi:oulu-201711243156.

Abstract:
This thesis aims to explore the concept of big data, and create understanding of big data maturity in the Finnish financial services industry. The research questions of this thesis are “What kind of big data solutions are being implemented in the Finnish financial services sector?” and “Which factors impede faster implementation of big data solutions in the Finnish financial services sector?”. Big data, being a concept usually linked with huge data sets and economies of scale, is an interesting topic for research in Finland, a market in which the size of data sets is somewhat limited by the size of the market. This thesis includes a literature review on the concept of big data, and earlier literature of the Finnish big data landscape, and a qualitative content analysis of available public information on big data maturity in the context of the Finnish financial services market. The results of this research show that in Finland big data is utilized to some extent, at least by the larger organizations. Financial services specific big data solutions include things like the automation of applications handling in insurance. The most clear and specific factors slowing the development of big data maturity in the industry are the lack of competent work-force and new regulations compliance projects taking development resources. These results can be used as an overview of the state of big data maturity in the Finnish financial services industry. This study also lays a solid foundation for further research in the form of conducting interviews, which would provide more in-depth data
The aim of this thesis is to clarify the concept of big data and to develop an understanding of the big data maturity of the Finnish financial sector. The research questions are “What kinds of big data solutions have been adopted in the financial sector in Finland?” and “Which factors slow the implementation of big data solutions in the financial sector in Finland?”. As a concept, big data is usually associated with massive volumes of data and economies of scale. Big data is therefore an interesting research topic in the Finnish context, where the size of data sets is to some extent limited by the size of the market. The thesis presents a literature-based definition of big data and a summary of the application of big data in Finland based on earlier studies. It carries out a qualitative content analysis of publicly available information on the use of big data in the financial sector in Finland. The results show that big data is utilized to some extent in the Finnish financial sector, at least in large organizations. Solutions specific to the financial sector include, for example, the automation of application handling processes. The clearest factors slowing the implementation of big data solutions are the lack of a skilled workforce and the pressure that new regulations place on development resources. The thesis forms an overall picture of the use of big data in the financial sector in Finland. The study is based on an analysis of public material, which in turn lays a foundation for further research on the topic. In the future, interviews could further deepen knowledge of the subject.
9

Flike, Felix, and Markus Gervard. "BIG DATA-ANALYS INOM FOTBOLLSORGANISATIONER En studie om big data-analys och värdeskapande." Thesis, Malmö universitet, Fakulteten för teknik och samhälle (TS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-20117.

Abstract:
Big data is a relatively new term, but the phenomenon has existed for a long time. It can be described in terms of five Vs: volume, veracity, variety, velocity and value. The analysis of big data has proved valuable to organizations in decision-making, in generating measurable economic benefits and in improving operations. In the sports industry, it began to be used in earnest in the early 2000s by the baseball organization Oakland Athletics. The club began recruiting players based on their statistics rather than on how well scouts judged their ability, which brought great success. This led more organizations to follow suit, and it was not long before big data analysis was used in all major sports to gain advantages over competitors. In the Swedish context, the use of these tools is still relatively new, and many organizations may have moved too fast in implementing them. This study aims to examine how football organizations work with big data analysis in relation to their players, based on a case analysis. The results show that both organizations create value from their investments, which benefits them in working towards their strategic goals. The organizations do this in different ways. Which way is most effective in terms of value creation cannot be answered on the basis of this study.
10

Nyström, Simon, and Joakim Lönnegren. "Processing data sources with big data frameworks." Thesis, KTH, Data- och elektroteknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-188204.

Abstract:
Big data is a concept that is expanding rapidly. As more and more data is generated and garnered, there is an increasing need for efficient solutions that can be utilized to process all this data in attempts to gain value from it. The purpose of this thesis is to find an efficient way to quickly process a large number of relatively small files. More specifically, the purpose is to test two frameworks that can be used for processing big data. The frameworks that are tested against each other are Apache NiFi and Apache Storm. A method is devised in order to, firstly, construct a data flow and secondly, construct a method for testing the performance and scalability of the frameworks running this data flow. The results reveal that Apache Storm is faster than Apache NiFi, at the sort of task that was tested. As the number of nodes included in the tests went up, the performance did not always do the same. This indicates that adding more nodes to a big data processing pipeline does not always result in a better performing setup and that, sometimes, other measures must be taken to heighten the performance.
Big data is a concept that is growing rapidly. As more and more data is generated and collected, there is an increasing need for efficient solutions that can be used to process all this data in attempts to extract value from it. The purpose of this thesis is to find an efficient way to quickly process a large number of relatively small files. More specifically, it is to test two frameworks that can be used for big data processing. The two frameworks tested against each other are Apache NiFi and Apache Storm. A method is described for, first, constructing a data flow and, second, constructing a method for testing the performance and scalability of the frameworks that run the data flow. The results reveal that Apache Storm is faster than NiFi for the type of test that was performed. When the number of nodes included in the tests was increased, performance did not always increase. This shows that increasing the number of nodes in a big data processing chain does not always lead to better performance and that other measures are sometimes required to increase performance.

Books on the topic "Genomics Big Data Engineering"

1

Wong, Ka-Chun, ed. Big Data Analytics in Genomics. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-41279-5.

2

Cui, Zhen, Jinshan Pan, Shanshan Zhang, Liang Xiao, and Jian Yang, eds. Intelligence Science and Big Data Engineering. Visual Data Engineering. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-36189-1.

3

Roy, Sanjiban Sekhar, Pijush Samui, Ravinesh Deo, and Stavros Ntalampiras, eds. Big Data in Engineering Applications. Singapore: Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-10-8476-8.

4

Feeney, Kevin, James Welch, and Jim Davies. Engineering Agile Big-Data Systems. New York: River Publishers, 2022. http://dx.doi.org/10.1201/9781003338123.

5

Lee, Roger, ed. Big Data, Cloud Computing, Data Science & Engineering. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-319-96803-2.

6

Lee, Roger, ed. Big Data, Cloud Computing, and Data Science Engineering. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-24405-7.

7

Lee, Roger, ed. Big Data, Cloud Computing, and Data Science Engineering. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-19608-9.

8

Cui, Zhen, Jinshan Pan, Shanshan Zhang, Liang Xiao, and Jian Yang, eds. Intelligence Science and Big Data Engineering. Big Data and Machine Learning. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-36204-1.

9

He, Xiaofei, Xinbo Gao, Yanning Zhang, Zhi-Hua Zhou, Zhi-Yong Liu, Baochuan Fu, Fuyuan Hu, and Zhancheng Zhang, eds. Intelligence Science and Big Data Engineering. Image and Video Data Engineering. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-23989-7.

10

Sun, Yi, Huchuan Lu, Lihe Zhang, Jian Yang, and Hua Huang, eds. Intelligence Science and Big Data Engineering. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-67777-4.


Book chapters on the topic "Genomics Big Data Engineering"

1

Borovska, Plamenka, Veska Gancheva, and Ivailo Georgiev. "Platform for Adaptive Knowledge Discovery and Decision Making Based on Big Genomics Data Analytics." In Bioinformatics and Biomedical Engineering, 297–308. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-17935-9_27.

2

Habyarimana, Ephrem, and Sofia Michailidou. "Genomics Data." In Big Data in Bioeconomy, 69–76. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-71069-9_6.

Abstract:
In silico prediction of plant performance is gaining increasing attention from breeders. Several statistical, mathematical and machine learning methodologies for analysis of phenotypic, omics and environmental data typically use individual or a few data layers. Genomic selection is one of the applications where heterogeneous data, such as those from omics technologies, are handled, accommodating several genetic models of inheritance. There are many new high-throughput Next Generation Sequencing (NGS) platforms on the market producing whole-genome data at a low cost. Hence, large-scale genomic data can be produced and analyzed, enabling intercrosses and fast-paced recurrent selection. The offspring properties can be predicted instead of manually evaluated in the field. Breeders have a short time window to make decisions by the time they receive data, which is one of the major challenges in commercial breeding. To implement genomic selection routinely as part of breeding programs, data management systems and analytics capacity must therefore be in order. The traditional relational database management systems (RDBMS), which are designed to store, manage and analyze large-scale data, offer appealing characteristics, particularly when they are upgraded with capabilities for working with binary large objects. In addition, NoSQL systems were considered effective tools for managing high-dimensional genomic data. The MongoDB system, a document-based NoSQL database, was effectively used to develop web-based tools for visualizing and exploring genotypic information. The Hierarchical Data Format (HDF5), a member of the high-performance distributed file systems family, demonstrated superior performance with high-dimensional and highly structured data such as genomic sequencing data.
3

Talukder, Asoke K. "Genomics 3.0: Big-data in Precision Medicine." In Big Data Analytics, 201–15. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-27057-9_14.

4

Fatima, Tahmeena, and S. Jyothi. "Genomics in Big Data Bioinformatics." In Learning and Analytics in Intelligent Systems, 661–67. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-46939-9_60.

5

Meyer, Lars-Peter, Jan Frenzel, Eric Peukert, René Jäkel, and Stefan Kühne. "Big Data Services." In Service Engineering, 63–77. Wiesbaden: Springer Fachmedien Wiesbaden, 2018. http://dx.doi.org/10.1007/978-3-658-20905-6_5.

6

Chan, Lawrence S. "Big Data Analytics." In Engineering-Medicine, 113–21. Boca Raton, FL: CRC Press, 2019. http://dx.doi.org/10.1201/9781351012270-13.

7

Tamburri, Damian, and Willem-Jan van den Heuvel. "Big Data Engineering." In Data Science for Entrepreneurship, 25–35. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-19554-9_2.

8

Agarwal, Mahima, Mohamood Adhil, and Asoke K. Talukder. "Multi-omics Multi-scale Big Data Analytics for Cancer Genomics." In Big Data Analytics, 228–43. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-27057-9_16.

9

Khoria, Vinamra, Amit Kumar, and Sanjiban Shekhar Roy. "Leukaemia Classification Using Machine Learning and Genomics." In Studies in Big Data, 87–99. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-16-9158-4_6.

10

Tam, Nguyen Thanh, and Insu Song. "Big Data Visualization:." In Lecture Notes in Electrical Engineering, 399–408. Singapore: Springer Singapore, 2016. http://dx.doi.org/10.1007/978-981-10-0557-2_40.


Conference papers on the topic "Genomics Big Data Engineering"

1

Jahangir, Sidrah, Peter John, Attya Bhatti, Muhammad Muaaz Aslam, and Mandy J. Peffers. "Data Integration for Big Data analytics to identify the gaps in Rheumatoid Arthritis Genomics in a Post-GWAS era." In 2022 2nd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE). IEEE, 2022. http://dx.doi.org/10.1109/icacite53722.2022.9823601.

2

Borovska, Plamenka, and Desislava Ivanova. "Intelligent method for adaptive in silico knowledge discovery based on big genomic data analytics." In PROCEEDINGS OF THE 44TH INTERNATIONAL CONFERENCE ON APPLICATIONS OF MATHEMATICS IN ENGINEERING AND ECONOMICS: (AMEE’18). AIP Publishing, 2018. http://dx.doi.org/10.1063/1.5082116.

3

Ivanova, Desislava, and Plamenka Borovska. "Scalable framework for adaptive in-silico knowledge discovery and decision-making out of genomic big data." In PROCEEDINGS OF THE 44TH INTERNATIONAL CONFERENCE ON APPLICATIONS OF MATHEMATICS IN ENGINEERING AND ECONOMICS: (AMEE’18). AIP Publishing, 2018. http://dx.doi.org/10.1063/1.5082134.

4

Godhandaraman, T., N. Pruthviraj, V. Praveenkumar, A. Banuprasad, and K. Karthick. "Big data in genomics." In 2017 International Conference on Algorithms, Methodology, Models and Applications in Emerging Technologies (ICAMMAET). IEEE, 2017. http://dx.doi.org/10.1109/icammaet.2017.8186739.

5

Gancheva, Veska, and Ivailo Georgiev. "Software architecture for adaptive in silico knowledge discovery and decision making based on big genomic data analytics." In PROCEEDINGS OF THE 45TH INTERNATIONAL CONFERENCE ON APPLICATION OF MATHEMATICS IN ENGINEERING AND ECONOMICS (AMEE’19). AIP Publishing, 2019. http://dx.doi.org/10.1063/1.5133586.

6

Bhardwaj, Ruchie, Adhiraaj Sethi, and Raghunath Nambiar. "Big data in genomics: An overview." In 2014 IEEE International Conference on Big Data (Big Data). IEEE, 2014. http://dx.doi.org/10.1109/bigdata.2014.7004392.

7

Tasoulis, Sotiris, Lu Cheng, Niko Valimaki, Nicholas J. Croucher, Simon R. Harris, William P. Hanage, Teemu Roos, and Jukka Corander. "Random projection based clustering for population genomics." In 2014 IEEE International Conference on Big Data (Big Data). IEEE, 2014. http://dx.doi.org/10.1109/bigdata.2014.7004291.

8

Yeo, Hangu, and Catherine H. Crawford. "Big Data: Cloud computing in genomics applications." In 2015 IEEE International Conference on Big Data (Big Data). IEEE, 2015. http://dx.doi.org/10.1109/bigdata.2015.7364117.

9

Langmead, Ben. "Practical software for big genomics data." In 2013 IEEE 3rd International Conference on Computational Advances in Bio and Medical Sciences (ICCABS). IEEE, 2013. http://dx.doi.org/10.1109/iccabs.2013.6629241.

10

Kochunov, Peter, Li Shen, John Darrell van Horn, and Paul M. Thompson. "Session Introduction: Big Data Imaging Genomics." In Pacific Symposium on Biocomputing 2022. WORLD SCIENTIFIC, 2021. http://dx.doi.org/10.1142/9789811250477_0007.


Reports on the topic "Genomics Big Data Engineering"

1

Greenberg, Jane, Samantha Grabus, Florence Hudson, Tim Kraska, Samuel Madden, René Bastón, and Katie Naum. The Northeast Big Data Innovation Hub: "Enabling Seamless Data Sharing in Industry and Academia" Workshop Report. Drexel University, March 2017. http://dx.doi.org/10.17918/d8159v.

Abstract:
Increasingly, both industry and academia, in fields ranging from biology and social sciences to computing and engineering, are driven by data (Provost & Fawcett, 2013; Wixom, et al, 2014); and both commercial success and academic impact are dependent on having access to data. Many organizations collecting data lack the expertise required to process it (Hazen, et al, 2014), and, thus, pursue data sharing with researchers who can extract more value from data they own. For example, a biosciences company may benefit from a specific analysis technique a researcher has developed. At the same time, researchers are always on the search for real-world data sets to demonstrate the effectiveness of their methods. Unfortunately, many data sharing attempts fail, for reasons ranging from legal restrictions on how data can be used—to privacy policies, different cultural norms, and technological barriers. In fact, many data sharing partnerships that are vital to addressing pressing societal challenges in cities, health, energy, and the environment are not being pursued due to such obstacles. Addressing these data sharing challenges requires open, supportive dialogue across many sectors, including technology, policy, industry, and academia. Further, there is a crucial need for well-defined agreements that can be shared among key stakeholders, including researchers, technologists, legal representatives, and technology transfer officers. The Northeast Big Data Innovation Hub (NEBDIH) took an important step in this area with the recent "Enabling Seamless Data Sharing in Industry and Academia" workshop, held at Drexel University September 29-30, 2016. The workshop brought together representatives from these critical stakeholder communities to launch a national dialogue on challenges and opportunities in this complex space.
2

Breiman, Adina, Jan Dvorak, Abraham Korol, and Eduard Akhunov. Population Genomics and Association Mapping of Disease Resistance Genes in Israeli Populations of Wild Relatives of Wheat, Triticum dicoccoides and Aegilops speltoides. United States Department of Agriculture, December 2011. http://dx.doi.org/10.32747/2011.7697121.bard.

Abstract:
Wheat is the most widely grown crop on earth; together with rice, it is second to maize in total global tonnage. One of the emerging threats to wheat is stripe (yellow) rust, especially in North Africa, West and Central Asia and North America. The most efficient way to control plant diseases is to introduce disease resistance genes. However, pathogens can rapidly overcome the effectiveness of these genes when they are widely used. Therefore, there is a constant need to find new resistance genes to replace the non-effective genes. The resistance gene pool in cultivated wheat is depleted, and there is a need to find new genes in the wild relatives of wheat. Wild emmer (Triticum dicoccoides), the progenitor of cultivated wheat, can serve as a valuable gene pool for breeding for disease resistance. The transfer of novel genes into elite cultivars is greatly facilitated by the availability of information on their chromosomal location. Therefore, our goals in this study were to find stripe rust resistant and susceptible genotypes in the Israeli T. dicoccoides population, to genotype them using state-of-the-art genotyping methods, and to find associations between genetic markers and stripe rust resistance. We screened 129 accessions from our collection of wild emmer wheat for resistance to three isolates of stripe rust. About 30% of the accessions were resistant to one or more isolates, 50% were susceptible, and the rest displayed an intermediate response. The accessions were genotyped with Illumina's Infinium assay, which consists of 9K single nucleotide polymorphism (SNP) markers. About 13% (1179) of the SNPs were polymorphic in the wild emmer population. Cluster analysis based on SNP diversity showed that there are two main groups in the wild population: a big cluster that probably belongs to the Horanum ssp. and a small cluster of the Judaicum ssp. To avoid population structure bias, the Judaicum ssp. was removed from the association analysis.
In the remaining group of genotypes, linkage disequilibrium (LD) measured along the chromosomes decayed rapidly within one centimorgan. This is the first time such an analysis has been conducted at a genome-wide level in wild emmer. Such a rapid decay in LD level, quite unexpected for a selfer, was not observed in the cultivated wheat collection. It indicates that wild emmer populations are highly suitable for association studies, yielding a better resolution than association studies in cultivated wheat or genetic mapping in bi-parental populations. A significant association was found between an SNP marker located in the distal region of chromosome arm 1BL and resistance to one of the isolates. This region is not known in the literature to bear a stripe rust resistance gene. Therefore, there may be a new stripe rust resistance gene at this locus. With the current fast increase of wheat genome sequence data, genome-wide association analysis becomes a feasible task and an efficient strategy for searching for novel genes in wild emmer wheat. In this study, we have shown that the wild emmer gene pool is a valuable source of new stripe rust resistance genes that can protect cultivated wheat.
3

Semerikov, Serhiy, Illia Teplytskyi, Yuliia Yechkalo, Oksana Markova, Vladimir Soloviev, and Arnold Kiv. Computer Simulation of Neural Networks Using Spreadsheets: Dr. Anderson, Welcome Back. [n.p.], June 2019. http://dx.doi.org/10.31812/123456789/3178.

Abstract:
The authors of this article continue the series begun with the 2018 paper “Computer Simulation of Neural Networks Using Spreadsheets: The Dawn of the Age of Camelot”. This time, they consider mathematical informatics as the basis of higher engineering education fundamentalization. Mathematical informatics deals with smart simulation, information security, long-term data storage and big data management, artificial intelligence systems, etc. The authors suggest studying basic principles of mathematical informatics by applying cloud-oriented means of various levels, including those traditionally considered supplementary, such as spreadsheets. The article considers ways of building neural network models in cloud-oriented spreadsheets, specifically Google Sheets. The model is based on the problem of classifying multi-dimensional data provided in “The Use of Multiple Measurements in Taxonomic Problems” by R. A. Fisher. Edgar Anderson’s role in collecting and preparing the data in the 1920s-1930s is discussed, as well as some peculiarities of data selection. Data are also presented on the method of multi-dimensional data presentation in the form of an ideograph, developed by Anderson and considered one of the first efficient ways of data visualization.
4

Microbiology in the 21st Century: Where Are We and Where Are We Going? American Society for Microbiology, 2004. http://dx.doi.org/10.1128/aamcol.5sept.2003.

Abstract:
The American Academy of Microbiology convened a colloquium September 5–7, 2003, in Charleston, South Carolina to discuss the central importance of microbes to life on earth, directions microbiology research will take in the 21st century, and ways to foster public literacy in this important field. Discussions centered on: the impact of microbes on the health of the planet and its inhabitants; the fundamental significance of microbiology to the study of all life forms; research challenges faced by microbiologists and the barriers to meeting those challenges; the need to integrate microbiology into school and university curricula; and public microbial literacy. This is an exciting time for microbiology. We are becoming increasingly aware that microbes are the basis of the biosphere. They are the ancestors of all living things and the support system for all other forms of life. Paradoxically, certain microbes pose a threat to human health and to the health of plants and animals. As the foundation of the biosphere and major determinants of human health, microbes claim a primary, fundamental role in life on earth. Hence, the study of microbes is pivotal to the study of all living things, and microbiology is essential for the study and understanding of all life on this planet. Microbiology research is changing rapidly. The field has been impacted by events that shape public perceptions of microbes, such as the emergence of globally significant diseases, threats of bioterrorism, increasing failure of formerly effective antibiotics and therapies to treat microbial diseases, and events that contaminate food on a large scale. Microbial research is taking advantage of the technological advancements that have opened new fields of inquiry, particularly in genomics. Basic areas of biological complexity, such as infectious diseases and the engineering of designer microbes for the benefit of society, are especially ripe areas for significant advancement. 
Overall, emphasis has increased in recent years on the evolution and ecology of microorganisms. Studies are focusing on the linkages between microbes and their phylogenetic origins and between microbes and their habitats. Increasingly, researchers are striving to join together the results of their work, moving to an integration of biological phenomena at all levels. While many areas of the microbiological sciences are ripe for exploration, microbiology must overcome a number of technological hurdles before it can fully accomplish its potential. We are at a unique time when the confluence of technological advances and the explosion of knowledge of microbial diversity will enable significant advances in microbiology, and in biology in general, over the next decade. To make the best progress, microbiology must reach across traditional departmental boundaries and integrate the expertise of scientists in other disciplines. Microbiologists are becoming increasingly aware of the need to harness the vast computing power available and apply it to better advantage in research. Current methods for curating research materials and data should be rethought and revamped. Finally, new facilities should be developed to house powerful research equipment and make it available, on a regional basis, to scientists who might otherwise lack access to the expensive tools of modern biology. It is not enough to accomplish cutting-edge research. We must also educate the children and college students of today, as they will be the researchers of tomorrow. Since microbiology provides exceptional teaching tools and is of pivotal importance to understanding biology, science education in schools should be refocused to include microbiology lessons and lab exercises. At the undergraduate level, a thorough knowledge of microbiology should be made a part of the core curriculum for life science majors. 
Since issues that deal with microbes have a direct bearing on the human condition, it is critical that the public-at-large become better grounded in the basics of microbiology. Public literacy campaigns must identify the issues to be conveyed and the best avenues for communicating those messages. Decision-makers at federal, state, local, and community levels should be made more aware of the ways that microbiology impacts human life and the ways school curricula could be improved to include valuable lessons in microbial science.