
Doctoral dissertations on the topic "Read data"

Create a correct reference in APA, MLA, Chicago, Harvard, and many other styles


Browse the top 50 doctoral dissertations on the topic "Read data".

An "Add to bibliography" button is available next to each work in the bibliography. Use it, and we will automatically create a bibliographic reference to the selected work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the publication in ".pdf" format and read its abstract online, whenever the corresponding details are available in the metadata.

Browse doctoral dissertations from a wide range of disciplines and compile an appropriate bibliography.

1

Burger, Joseph. "Real-time engagement area development program (READ-Pro)". Monterey, Calif.: Springfield, Va.: Naval Postgraduate School; Available from National Technical Information Service, 2002. http://library.nps.navy.mil/uhtbin/hyperion-image/02Jun%5FBurger.pdf.

2

Lecompte, Lolita. "Structural variant genotyping with long read data". Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1S054.

Abstract:
Structural variants (SVs) are genomic rearrangements of more than 50 base pairs. Since SVs can reach several thousand base pairs, they can have huge impacts on genome function; studying SVs is therefore of great interest. Recently, a new generation of sequencing technologies has been developed that produces long read data of tens of thousands of base pairs, which are particularly useful for spanning SV breakpoints. So far, bioinformatics methods have focused on the SV discovery problem with long read data. However, no method has been proposed to specifically address the issue of genotyping SVs with long read data. The purpose of SV genotyping is to assess, for each variant of a given input set, which alleles are present in a newly sequenced sample. This thesis proposes a new method for genotyping SVs with long read data, based on a representation of the sequence of each allele. We also defined a set of conditions for considering a read as supporting an allele. Our method has been implemented in a tool called SVJedi. The tool has been validated on both simulated and real human data and achieves high genotyping accuracy. We show that SVJedi obtains better performance than other existing long read genotyping tools, and we also demonstrate that SV genotyping is considerably improved with SVJedi compared to other approaches, namely SV discovery and short read SV genotyping approaches.
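To make the allele-support idea above concrete, here is a minimal sketch of a final genotyping step: given how many informative long reads support the reference and the alternative allele of one SV, pick the most likely genotype under a toy binomial model. The error rate and labels are illustrative assumptions, not SVJedi's actual model or thresholds.

```python
from math import comb, log

def genotype_from_support(ref_reads: int, alt_reads: int,
                          error: float = 0.05) -> tuple[str, float]:
    """Pick the most likely genotype for one SV from allele-support counts.

    Toy binomial model: the expected fraction of reads supporting the
    alternative allele is `error` for 0/0, 0.5 for 0/1 and 1 - error for 1/1.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return "./.", 0.0                      # no informative reads
    expected_alt = {"0/0": error, "0/1": 0.5, "1/1": 1.0 - error}
    loglik = {
        gt: log(comb(n, alt_reads)) + alt_reads * log(p) + ref_reads * log(1.0 - p)
        for gt, p in expected_alt.items()
    }
    best = max(loglik, key=loglik.get)
    return best, loglik[best]

print(genotype_from_support(ref_reads=12, alt_reads=11))   # heterozygous-looking support
print(genotype_from_support(ref_reads=1, alt_reads=20))    # homozygous-alternative support
```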
3

Walter, Sarah. "Parallel read/write system for optical data storage". Diss., Connect to online resource, 2005. http://wwwlib.umi.com/cr/colorado/fullcit?p1425767.

4

Ibanez, Luis Daniel. "Towards a read/write web of linked data". Nantes, 2015. http://archive.bu.univ-nantes.fr/pollux/show.action?id=9089939a-874b-44e1-a049-86a4c5c5d0e6.

Abstract:
The Linked Data initiative has made millions of pieces of data available for querying through a federation of autonomous participants. However, the Web of Linked Data suffers from problems of data heterogeneity and quality. We cast the problem of integrating heterogeneous data sources as a Local-as-View (LAV) mediation problem; unfortunately, LAV may require the execution of a number of “rewritings” that is exponential in the number of query subgoals. We propose the Graph-Union (GUN) strategy to maximise the results obtained from a subset of rewritings. Compared to traditional rewriting execution strategies, GUN improves execution time and the number of results obtained in exchange for higher memory consumption. Once data can be queried, data consumers can detect quality issues, but to resolve them they need to write to the data of the sources, i.e., to evolve Linked Data from read-only to read-write. However, writing among autonomous participants raises consistency issues. We model Read-Write Linked Data as a social network where actors copy the data they are interested in, update it and publish updates to exchange with others. We propose two algorithms for update exchange: SU-Set, which achieves Strong Eventual Consistency (SEC), and Col-Graph, which achieves Fragment Consistency, stronger than SEC. We analyze the worst- and best-case complexities of both algorithms and experimentally estimate the average complexity of Col-Graph; the results suggest that it is feasible for social network topologies.
5

Horne, Ross J. "Programming languages and principles for read-write linked data". Thesis, University of Southampton, 2011. https://eprints.soton.ac.uk/210899/.

Abstract:
This work addresses a gap in the foundations of computer science. In particular, only a limited number of models address design decisions in modern Web architectures. The development of the modern Web architecture tends to be guided by the intuition of engineers. The intuition of an engineer is probably more powerful than any model; however, models are important tools to aid principled design decisions. No model is sufficiently strong to provide absolute certainty of correctness; however, an architecture accompanied by a model is stronger than an architecture accompanied solely by intuition led by the personal, hence subjective, subliminal ego. The Web of Data describes an architecture characterised by key W3C standards. Key standards include a semi-structured data format, entailment mechanism and query language. Recently, prominent figures have drawn attention to the necessity of update languages for the Web of Data, coining the notion of Read–Write Linked Data. A dynamic Web of Data with updates is a more realistic reflection of the Web. An established and versatile approach to modelling dynamic languages is to define an operational semantics. This work provides such an operational semantics for a Read–Write Linked Data architecture. Furthermore, the model is sufficiently general to capture the established standards, including queries and entailments. Each feature is relatively easily modelled in isolation; however, a model which checks that the key standards socialise is a greater challenge, to which operational semantics are suited. The model validates most features of the standards while raising some serious questions. Further to evaluating W3C standards, the operational semantics provides a foundation for static analysis. One approach is to derive an algebra for the model. The algebra is proven to be sound with respect to the operational semantics. Soundness ensures that the algebraic rules preserve operational behaviour. If the algebra establishes that two updates are equivalent, then they have the same operational capabilities. This is useful for optimisation, since the real cost of executing the updates may differ, despite their equivalent expressive powers. A notion of operational refinement is discussed, which allows a non-deterministic update to be refined to a more deterministic update. Another approach to the static analysis of Read–Write Linked Data is through a type system. The simplest type system for this application simply checks that well-understood terms which appear in the semi-structured data, such as numbers and strings of characters, are used correctly. Static analysis then verifies that basic runtime errors in a well-typed program do not occur. Type systems for URIs are also investigated, inspired by W3C standards. Type systems for URIs are controversial, since URIs have no internal structure and thus have no obvious non-trivial types. Thus, a flexible type system which accommodates several approaches to typing URIs is proposed.
6

Huang, Songbo, and 黄颂博. "Detection of splice junctions and gene fusions via short read alignment". Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2011. http://hub.hku.hk/bib/B45862527.

7

Saleem, Muhammad. "Automated Analysis of Automotive Read-Out Data for Better Decision Making". Thesis, Linköpings universitet, Institutionen för datavetenskap, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-63785.

Abstract:
The modern automobile is a complex electromechanical system controlled by control systems that consist of several interdependent electronic control units (ECUs). Analysis of the data generated by these modules is very important in order to observe interesting patterns in the data. At Volvo Cars Corporation today, diagnostic read-out data is retrieved from client machines installed at workshops in different countries around the world. The problem with this data is that it does not give a clear picture of what is causing what, i.e., it does not make the problem easy to track. Diagnostic engineers at Volvo Cars Corporation perform routine statistical analysis of diagnostic read-out data manually, which is time-consuming and tedious work. Moreover, this analysis is restricted to a basic level, mainly statistical analysis of diagnostic read-out data. We present an approach based on statistical analysis and cluster analysis. Our approach focuses on analysing the data from a purely statistical standpoint to isolate the problem in diagnostic read-out data, thereby helping to visualize and analyse the nature of the problem at hand. Different general statistical formulae were applied to extract meaningful information from large amounts of DRO data. Cluster analysis was carried out to obtain clusters consisting of similar trouble codes. Different methods and techniques were considered for the purpose of cluster analysis, and hierarchical and non-hierarchical clusters were extracted by applying appropriate algorithms. The results of the thesis work show that the diagnostic read-out data consist of independent and interdependent fault codes. Groups were generated which consist of similar trouble codes. Furthermore, the corresponding factors from freeze-frame data which show significant variation for these groups were also extracted. These faults, groups of faults and factors were later interpreted and validated by diagnostic engineers.
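As a small illustration of the kind of grouping described above, the sketch below clusters diagnostic trouble codes (DTCs) by how often they co-occur in the same read-outs, using hierarchical clustering from SciPy. The matrix and the choice of Jaccard distance with average linkage are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Rows: diagnostic read-outs; columns: presence (True) / absence (False) of a
# trouble code. The matrix is made up for illustration.
readouts = np.array([
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 1],
], dtype=bool)

# Cluster the trouble codes (columns) by their co-occurrence pattern across
# read-outs, using Jaccard distance and average linkage.
tree = linkage(readouts.T, method="average", metric="jaccard")
labels = fcluster(tree, t=2, criterion="maxclust")   # cut the tree into two groups
print(labels)   # codes that tend to appear in the same read-outs share a label
```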
8

Frousios, Kimon. "Bioinformatic analysis of genomic sequencing data : read alignment and variant evaluation". Thesis, King's College London (University of London), 2014. http://kclpure.kcl.ac.uk/portal/en/theses/bioinformatic-analysis-of-genomic-sequencing-data(e3a55df7-543e-4eaa-a81e-6534eacf6250).html.

Abstract:
The invention and rise in popularity of Next Generation Sequencing technologies have led to a steep increase in sequencing data and the rise of new challenges. This thesis aims to contribute methods for the analysis of NGS data, and focuses on two of the challenges presented by these data. The first challenge regards the need for NGS reads to be aligned to a reference sequence, as their short length complicates direct assembly. A great number of tools exist that carry out this task quickly and efficiently, yet they all rely on the mere count of mismatches in order to assess alignments, ignoring the knowledge that genome composition and mutation frequencies are biased. Thus, the use of a scoring matrix that incorporates the mutation and composition biases observed among humans was tested with simulated reads. The scoring matrix was implemented and incorporated into the in-house algorithm REAL, allowing side-by-side comparison of the performance of the biased model and the mismatch count. The algorithm REAL was also used to investigate the applicability of NGS RNA-seq data to the understanding of the relationship between genomic expression and the compartmentalisation of genomic base composition into isochores. The second challenge regards the evaluation of the variants (SNPs) that are discovered by sequencing. NGS technologies have caused a sharp rise in the rate at which new SNPs are discovered, rendering the experimental validation of each one impossible. Several tools exist that take into account various properties of the genome, the transcripts and the protein products relevant to the location of an SNP and attempt to predict the SNP's impact. These tools are valuable in screening and prioritising SNPs likely to have a causative association with a genetic disease of interest. Despite the number of individual tools and the diversity of their resources, no attempt had been made to draw a consensus among them. Two consensus approaches were considered, one based on a very simplistic vote majority of the tools considered, and one based on machine learning. Both methods proved to offer highly competitive classification, both against the individual tools and against other consensus methods that were published in the meantime.
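The vote-majority consensus mentioned above can be illustrated in a few lines; the sketch below is a minimal version in which the tool names and labels are hypothetical, and it does not attempt to reproduce the machine-learning consensus also studied in the thesis.

```python
def consensus_call(predictions: dict[str, str], damaging_label: str = "damaging") -> str:
    """Naive vote-majority consensus over per-tool SNP impact predictions.

    `predictions` maps a tool name to its call, e.g. {"toolA": "damaging", ...}.
    The SNP is called damaging only if a strict majority of tools agree.
    """
    votes = sum(1 for call in predictions.values() if call == damaging_label)
    return damaging_label if votes > len(predictions) / 2 else "benign"

print(consensus_call({"toolA": "damaging", "toolB": "damaging", "toolC": "benign"}))
```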
9

Hoffmann, Steve. "Genome Informatics for High-Throughput Sequencing Data Analysis". Doctoral thesis, Universitätsbibliothek Leipzig, 2014. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-152643.

Abstract:
This thesis introduces three different algorithmic and statistical strategies for the analysis of high-throughput sequencing data. First, we introduce a heuristic method based on enhanced suffix arrays to map short sequences to larger reference genomes. The algorithm builds on the idea of an error-tolerant traversal of the suffix array for the reference genome in conjunction with the concept of matching statistics introduced by Chang and a bitvector-based alignment algorithm proposed by Myers. The algorithm supports paired-end and mate-pair alignments, and the implementation offers methods for primer detection and for primer and poly-A trimming. In our own benchmarks as well as independent benchmarks, this tool outcompetes other currently available tools with respect to sensitivity and specificity in simulated and real data sets for a large number of sequencing protocols. Second, we introduce a novel dynamic programming algorithm for the spliced alignment problem. The advantage of this algorithm is its capability to detect not only collinear splice events, i.e., local splice events on the same genomic strand, but also circular and other non-collinear splice events. This succinct and simple algorithm handles all these cases at the same time with high accuracy. While it is on par with other state-of-the-art methods for collinear splice events, it outcompetes other tools for many non-collinear splice events. The application of this method to publicly available sequencing data led to the identification of a novel isoform of the tumor suppressor gene p53. Since this gene is one of the best studied genes in the human genome, this finding is quite remarkable and suggests that the application of our algorithm could help to identify a plethora of novel isoforms and genes. Third, we present a data-adaptive method to call single nucleotide variations (SNVs) from aligned high-throughput sequencing reads. We demonstrate that our method, based on empirical log-likelihoods, automatically adjusts to the quality of a sequencing experiment and thus renders a "decision" on when to call an SNV. In our simulations this method is on par with current state-of-the-art tools. Finally, we present biological results that have been obtained using the special features of the presented alignment algorithm.
10

Wang, Frank Zhigang. "Advanced magnetic thin-film heads under read-while-write operation". Thesis, University of Plymouth, 1999. http://hdl.handle.net/10026.1/2353.

Abstract:
A Read-While-Write (RWW) operation for tape and/or potentially disk applications is needed in the following three cases: 1. High reliability; 2. Data servo systems; 3. Buried servo systems. All these applications mean that the read (servo) head and write head are operative simultaneously. Consequently, RWW operation will require work to suppress the so-called crossfeed field radiation from the write head. Traditionally, write-read crossfeed has been reduced in conventional magnetic recording heads by a variety of screening methods, but the effectiveness of these methods is very limited. On the other hand, the early theoretical investigations of the crossfeed problem, which concentrated on the flux line pattern in front of a head structure based on a simplified model, may not be comprehensive. Today a growing number of magnetic recording equipment manufacturers employ thin-film technology to fabricate heads, and thereby the size of the modern head is much smaller than in the past. The increasing use of thin-film metallic magnetic materials for heads, along with the appearance of other new technologies, such as the MR reproductive mode and keepered media, has stimulated the need for an increased understanding of the crossfeed problem through advanced analysis methods and for a satisfactory practical solution to achieve the RWW operation. The work described in this thesis to suppress the crossfeed field involves both a novel reproductive mode of a Dual Magnetoresistive (DMR) head, originally designed to provide a large reproduce sensitivity at high linear recording densities exceeding 100 kFCI, which plays the key role in suppressing the crossfeed (the corresponding signal-to-noise ratio is over 38 dB), and several other compensation schemes giving further suppression. Advanced analytical and numerical methods of estimating crossfeed in single- and multi-track thin-film/MR heads under both DC and AC excitations can often help a head designer understand how the crossfeed field spreads and therefore how to suppress it from the standpoint of an overall head configuration. This work also assesses the scale of the crossfeed problem by making measurements on current and improved heads, thereby identifying the main contributors to crossfeed. The relevance of this work to the computer industry is clear for achieving simultaneous operation of the read head and write head, especially in a thin-film head assembly. This is because computer data rates must increase to meet the demands of storing more and more information in less time as computer graphics packages become more sophisticated.
11

Gallo, John T. "Design of a holographic read-only-memory for parallel data transfer to integrated CMOS circuits". Diss., Georgia Institute of Technology, 1991. http://hdl.handle.net/1853/15640.

12

Häuser, Philipp. "Caching and prefetching for efficient read access to multidimensional wave propagation data on disk". [S.l. : s.n.], 2007. http://nbn-resolving.de/urn:nbn:de:bsz:93-opus-33500.

13

Huang, Chenhao. "Choosing read location: understanding and controlling the performance-staleness trade-off in primary backup data stores". Thesis, The University of Sydney, 2022. https://hdl.handle.net/2123/27422.

Abstract:
Many distributed databases deploy primary-copy asynchronous replication, and offer programmer control so reads can be directed to either the primary node (which is always up-to-date, but may be heavily loaded by all the writes) or the secondaries (which may be less loaded, but perhaps have stale data). One example is MongoDB, a popular document store that is both available as a cloud-hosted service and can be deployed on-premises. The state-of-practice is to express where the reads are routed directly in the code, at application development time, based on the programmers' imperfect expectations of what workload will be applied to the system and what hardware will be running the code. In this approach, the programmers' choice may perform badly under some workload patterns which could arise during run-time. Furthermore, it might not be able to utilize the given resources to their full potential -- meaning database customers pay more money than needed. This thesis helps the programmers to choose where to route the read requests by understanding how their choices affect system performance in MongoDB, using extensive experiments on various workloads. It also describes various ways to determine data staleness both from the server side and from the client side. A novel system called Decongestant is proposed, which will automatically and dynamically, as the application is running, choose where to direct reads. We build a prototype system for Decongestant and we run experiments to demonstrate that Decongestant is able to send enough reads to secondaries when this will reduce load on a congested primary and boost the performance of the database as a whole, but without exceeding the maximum data staleness that the clients are willing to accept. Decongestant also adapts well to dynamically changing workloads, obtaining performance benefits when they can arise from use of the secondaries, while ensuring that returned values are fresh enough given client requirements.
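For context on the programmer-controlled routing the thesis starts from, the snippet below shows the static state of practice in MongoDB: the read preference is fixed in the code per database handle. The connection string, database and collection names are placeholders, and the dynamic routing performed by Decongestant is intentionally not reproduced here.

```python
from pymongo import MongoClient, ReadPreference

# Placeholder replica-set connection string.
client = MongoClient("mongodb://db0:27017,db1:27017,db2:27017/?replicaSet=rs0")

# Analytics-style reads: tolerate some staleness, offload the primary.
reporting_db = client.get_database(
    "shop", read_preference=ReadPreference.SECONDARY_PREFERRED
)

# Freshness-critical reads keep the default read preference (the primary).
orders = client["shop"]["orders"]

stale_ok = reporting_db["orders"].count_documents({"status": "shipped"})
fresh = orders.find_one({"_id": 12345})
```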
14

Štajner, Sanja. "New data-driven approaches to text simplification". Thesis, University of Wolverhampton, 2016. http://hdl.handle.net/2436/601113.

15

Štajner, Sanja. "New data-driven approaches to text simplification". Thesis, University of Wolverhampton, 2015. http://hdl.handle.net/2436/554413.

Abstract:
Many texts we encounter in our everyday lives are lexically and syntactically very complex. This makes them difficult to understand for people with intellectual or reading impairments, and difficult for various natural language processing systems to process. This motivated the need for text simplification (TS), which transforms texts into simpler variants. Given that this is still a relatively new research area, many challenges remain. The focus of this thesis is on better understanding the current problems in automatic text simplification (ATS) and proposing new data-driven approaches to solving them. We propose methods for learning sentence splitting and deletion decisions, built upon parallel corpora of original and manually simplified Spanish texts, which outperform existing similar systems. Our experiments in adapting those methods to different text genres and target populations report promising results, thus offering one possible solution for dealing with the scarcity of parallel corpora for text simplification aimed at specific target populations, which is currently one of the main issues in ATS. The results of our extensive analysis of the phrase-based statistical machine translation (PB-SMT) approach to ATS reject the widespread assumption that the success of that approach largely depends on the size of the training and development datasets. They indicate more influential factors for the success of the PB-SMT approach to ATS, and reveal some important differences between cross-lingual MT and the monolingual MT used in ATS. Our event-based system for simplifying news stories in English (EventSimplify) overcomes some of the main problems in ATS. It does not require a large number of handcrafted simplification rules nor parallel data, and it performs significant content reduction. The automatic and human evaluations conducted show that it produces grammatical text and increases readability, preserving and simplifying relevant content and reducing irrelevant content. Finally, this thesis addresses another important issue in TS, namely how to automatically evaluate the performance of TS systems given that access to the target users might be difficult. Our experiments indicate that existing readability metrics can successfully be used for this task when enriched with human evaluation of grammaticality and preservation of meaning.
16

Christensen, Kathryn S. "Architectural development and performance analysis of a primary data cache with read miss address prediction capability". Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 1998. http://handle.dtic.mil/100.2/ADA349783.

Abstract:
Thesis (M.S. in Electrical Engineering), Naval Postgraduate School, June 1998. Thesis advisor(s): Douglas J. Fouts, Frederick Terman. Includes bibliographical references (p. 77). Also available online.
17

Sahlin, Kristoffer. "Algorithms and statistical models for scaffolding contig assemblies and detecting structural variants using read pair data". Doctoral thesis, KTH, Beräkningsbiologi, CB, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-173580.

Abstract:
Advances in throughput from Next Generation Sequencing (NGS) methods have provided new ways to study molecular biology. The increased amount of data enables genome-wide studies of structural variation, transcription, translation and genome composition. Not only is the scale of each experiment large; lowered cost and faster turn-around have also increased the frequency with which new experiments are conducted. With the data growth comes an increase in demand for efficient and robust algorithms, which is a great computational challenge. The design of computationally efficient algorithms is crucial to cope with the amount of data, and it is relatively easy to verify an efficient algorithm by runtime and memory consumption. However, as NGS data comes with several artifacts in addition to its size, the difficulty lies in verifying that the algorithm gives accurate results and is robust to different data sets. This thesis focuses on modeling assumptions of mate-pair and paired-end reads when scaffolding contig assemblies or detecting variants. Both genome assembly and structural variation detection are difficult problems, partly because of the computationally complex nature of the problems, but also due to various noise and artifacts in input data. Constructing methods that address all artifacts and parameters in the data is difficult, if not impossible, and end-to-end pipelines often come with several simplifications. Instead of tackling these difficult problems all at once, a large part of this thesis concentrates on smaller problems around scaffolding and structural variation detection. By identifying and modeling parts of the problem where simplifications have been made in other algorithms, we obtain an improved solution to the corresponding full problem. The first paper shows an improved model to estimate gap sizes, hence contig placement, in the scaffolding problem. The second paper introduces a new scaffolder to scaffold large complex genomes, and the third paper extends the scaffolding method to account for paired-end contamination in mate-pair libraries. The fourth paper investigates detection of structural variants using fragment length information and corrects a commonly assumed null-hypothesis distribution used to detect structural variants.
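To illustrate the gap-size estimation problem mentioned for the first paper, here is the naive estimator that simply averages what is left of the mean insert size after subtracting how far each linking read pair maps from the contig ends. The numbers are made up, and the bias-corrected maximum-likelihood model actually proposed in the thesis is not reproduced here.

```python
from statistics import mean

def naive_gap_estimate(links: list[tuple[int, int]],
                       mean_insert: float) -> float:
    """Naive gap estimate between two contigs joined by mate-pair links.

    Each link is (tail_a, tail_b): how far the two reads of a pair map from
    the facing ends of contig A and contig B. The naive estimate subtracts
    those tails from the library's mean insert size and averages over links.
    """
    return mean(mean_insert - (tail_a + tail_b) for tail_a, tail_b in links)

links = [(310, 420), (150, 660), (500, 240)]   # made-up read-pair placements
print(round(naive_gap_estimate(links, mean_insert=3000.0)))
```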


18

Lenz, Lauren Holt. "Statistical Methods to Account for Gene-Level Covariates in Normalization of High-Dimensional Read-Count Data". DigitalCommons@USU, 2018. https://digitalcommons.usu.edu/etd/7392.

Abstract:
The goal of genetic-based cancer research is often to identify which genes behave differently in cancerous and healthy tissue. This difference in behavior, referred to as differential expression, may lead researchers to more targeted preventative care and treatment. One way to measure the expression of genes is through a process called RNA-Seq, which takes physical tissue samples and maps gene products and fragments in the sample back to the genes that created them, resulting in a large read-count matrix with genes in the rows and a column for each sample. The read-counts for tumor and normal samples are then compared in a process called differential expression analysis. However, normalization of these read-counts is a necessary pre-processing step, in order to account for differences in the read-count values due to non-expression-related variables. It is common in recent RNA-Seq normalization methods to also account for gene-level covariates, namely gene length in base pairs and GC-content, the proportion of bases in the gene that are guanine and cytosine. Here a colorectal cancer RNA-Seq read-count data set comprising 30,220 genes and 378 samples is examined. Two of the normalization methods that account for gene length and GC-content, CQN and EDASeq, are extended to account for protein coding status as a third gene-level covariate. The binary nature of protein coding status results in unique computational issues. The normalized read counts from CQN, EDASeq, and four new normalization methods are then used for differential expression analysis via the nonparametric Wilcoxon rank-sum test as well as the lme4 pipeline that produces per-gene models based on a negative binomial distribution. The resulting differential expression results are compared for two genes of interest in colorectal cancer, APC and CTNNB1, both in the WNT signaling pathway.
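As a rough illustration of covariate-aware normalization of the kind described above, the sketch below removes a linear trend of log read counts on GC-content, log gene length and a protein-coding flag. It is a deliberately simplified stand-in: CQN and EDASeq use quantile- and loess-based fits, and the numbers here are invented.

```python
import numpy as np

def covariate_adjust(counts, gc, length_bp, coding):
    """Remove gene-level covariate trends from log read counts.

    Fit ordinary least squares of log(count + 1) on GC-content, log gene
    length and a 0/1 protein-coding flag, then return the residuals plus the
    grand mean as covariate-adjusted log counts.
    """
    y = np.log1p(np.asarray(counts, dtype=float))
    X = np.column_stack([
        np.ones_like(y),                             # intercept
        np.asarray(gc, dtype=float),                 # GC proportion
        np.log(np.asarray(length_bp, dtype=float)),  # log gene length
        np.asarray(coding, dtype=float),             # protein-coding flag
    ])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta + y.mean()

adjusted = covariate_adjust(counts=[120, 35, 0, 980],
                            gc=[0.41, 0.62, 0.55, 0.38],
                            length_bp=[2100, 850, 4000, 1500],
                            coding=[1, 1, 0, 1])
print(adjusted)
```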
19

Tithi, Saima Sultana. "Computational Analysis of Viruses in Metagenomic Data". Diss., Virginia Tech, 2019. http://hdl.handle.net/10919/97194.

Abstract:
Viruses have a huge impact on controlling diseases and regulating many key ecosystem processes. As metagenomic data can contain many microbiomes, including many viruses, analyzing metagenomic data allows many viruses to be analyzed at the same time. The first step towards analyzing metagenomic data is to identify and quantify the viruses present in the data. In order to answer this question, we developed a computational pipeline, FastViromeExplorer. FastViromeExplorer leverages a pseudoalignment-based approach, which is faster than the traditional alignment-based approach, to quickly align millions or billions of reads. Application of FastViromeExplorer to both human gut samples and environmental samples shows that our tool can successfully identify viruses and quantify their abundances quickly and accurately, even for a large data set. Although viruses are getting increased attention in recent times, most viruses are still unknown or uncategorized. To discover novel viruses from metagenomic data, we developed a computational pipeline named FVE-novel. FVE-novel leverages a hybrid of reference-based and de novo assembly approaches to recover novel viruses from metagenomic data. By applying FVE-novel to an ocean metagenome sample, we successfully recovered two novel viruses and two different strains of known phages. Analysis of viral assemblies from metagenomic data reveals that they often contain assembly errors such as chimeric sequences, which means that more than one viral genome is incorrectly assembled together. In order to identify and fix these types of assembly errors, we developed a computational tool called VirChecker. Our tool can identify and fix assembly errors due to chimeric assembly. VirChecker also extends the assembly as much as possible to complete it and then annotates the extended and improved assembly. Application of VirChecker to viral scaffolds collected from an ocean metagenome sample shows that our tool successfully fixes the assembly errors and extends two novel virus genomes and two strains of known phage genomes.
Doctor of Philosophy
Viruses, the most abundant micro-organisms on earth, have a profound impact on human health and the environment. Analyzing metagenomic data for viruses has the benefit of analyzing many viruses at a time without the need to cultivate them in a lab environment. Here, in this dissertation, we addressed three research problems of analyzing viruses from metagenomic data. To analyze viruses in metagenomic data, the first question to answer is which viruses are there and in what quantity. To answer this question, we developed a computational pipeline, FastViromeExplorer. Our tool can identify viruses from metagenomic data and quantify the abundances of the viruses present in the data quickly and accurately, even for a large data set. To recover novel virus genomes from metagenomic data, we developed a computational pipeline named FVE-novel. By applying FVE-novel to an ocean metagenome sample, we successfully recovered two novel viruses and two strains of known phages. Examination of viral assemblies from metagenomic data reveals that, due to the complex nature of metagenome data, viral assemblies often contain assembly errors and are incomplete. To solve this problem, we developed a computational pipeline, named VirChecker, to polish, extend and annotate viral assemblies. Application of VirChecker to virus genomes recovered from an ocean metagenome sample shows that our tool successfully extended and completed those virus genomes.
20

Zeng, Shuai, i 曾帥. "Predicting functional impact of nonsynonymous mutations by quantifying conservation information and detect indels using split-read approach". Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2014. http://hdl.handle.net/10722/198818.

Abstract:
The rapidly developing sequencing technology has brought an opportunity for scientists to look into the detailed genotype information in the human genome. Computational programs have played important roles in identifying disease-related genomic variants from huge amounts of sequencing data. In the past years, a number of computational algorithms have been developed, solving many crucial problems in sequencing data analysis, such as mapping sequencing reads to the genome and identifying SNPs. However, many difficult and important issues still await satisfactory solutions. A key challenge is identifying disease-related mutations against the background of non-pathogenic polymorphisms. Another crucial problem is detecting INDELs, especially long deletions, under the technical limitations of second generation sequencing technology. To predict disease-related mutations, we developed a machine learning-based (random forests) prediction tool, EFIN (Evaluation of Functional Impact of Nonsynonymous mutations). We build a multiple sequence alignment (MSA) for a query protein with its homologous sequences. The MSA is later divided into different blocks according to the taxonomic information of the sequences. After that, we quantify the conservation in each block using a number of selected features, for example entropy, a concept borrowed from information theory. EFIN was trained on the Swiss-Prot and HumDiv datasets. In a series of fair comparisons, EFIN showed better results than widely used algorithms in terms of AUC (area under the ROC curve), accuracy, specificity and sensitivity. The web-based database is provided to worldwide users at paed.hku.hk/efin. To solve the second problem, we developed Linux-based software, SPLindel, that detects deletions (especially long deletions) and insertions using second generation sequencing data. For each sample, SPLindel uses a split-read method to detect candidate INDELs by building alternative references to go along with the reference sequences. We then remap all the relevant reads using both the original references and the alternative allele references. A Bayesian model integrating paired-end information is used to assign the reads to the most likely locations on either the original reference allele or the alternative allele. Finally, we count the number of reads that support the alternative allele (with insertions or deletions compared to the original reference allele) and the original allele, and fit a beta-binomial mixture model. Based on this model, the likelihood for each INDEL is calculated and the genotype is predicted. SPLindel runs at about the same speed as GATK but much faster than DINDEL. SPLindel obtains very similar results to GATK and DINDEL for INDELs of size 1-15 bp, but is much more effective in detecting INDELs of larger size. Using machine learning methods and statistical modeling technology, we propose tools to solve these two important problems in sequencing data analysis. This study will help identify novel damaging nsSNPs more accurately and efficiently, and equip researchers with a more powerful tool for identifying INDELs, especially long deletions. As more and more sequencing data are generated, the methods and tools introduced in this thesis may help us extract useful information to facilitate the identification of mutations causal to human diseases.
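To make the split-read signal described above concrete, the toy function below detects a single deletion by finding a read whose prefix and suffix both match the reference exactly, with a gap between the two matches. It only illustrates the signal itself: the read, reference and anchor length are invented, and SPLindel's alternative-reference remapping, Bayesian read assignment and beta-binomial genotyping are not reproduced.

```python
def split_read_deletion(read: str, ref: str, min_anchor: int = 20):
    """Toy split-read scan for one deletion, by exact prefix/suffix matching.

    Returns the deletion start on the reference and its length if the read's
    prefix and suffix both match the reference with a gap between them.
    """
    for split in range(min_anchor, len(read) - min_anchor + 1):
        prefix, suffix = read[:split], read[split:]
        p = ref.find(prefix)
        if p < 0:
            continue
        q = ref.find(suffix, p + len(prefix))
        if q > p + len(prefix):                    # gap in the reference = deletion
            return {"del_start": p + len(prefix), "del_len": q - (p + len(prefix))}
    return None

ref  = "ACGTACGTTTGACCA" + "GGGGGGGGGG" + "TTCAGGCTAACGTAA"   # 10 bp that the sample lacks
read = "ACGTACGTTTGACCA" + "TTCAGGCTAACGTAA"                   # read spanning the deletion
print(split_read_deletion(read, ref, min_anchor=10))           # {'del_start': 15, 'del_len': 10}
```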
Paediatrics and Adolescent Medicine
Doctoral
Doctor of Philosophy
21

Söderbäck, Karl. "Organizing HLA data for improved navigation and searchability". Thesis, Linköpings universitet, Databas och informationsteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-176029.

Abstract:
Pitch Technologies specializes in the HLA standard, a standard that specifies data exchange between simulators. The company provides a solution for recording HLA data into a database as raw byte data entries. In this thesis, different design solutions for storing and organizing recorded HLA data in a manner that reflects the content of the data are proposed and implemented, with the aim of making the data possible to query and analyze after recording. The design solutions' impact on storage, read and write performance, and usability is evaluated through a suite of tests run on a PostgreSQL database and a TimescaleDB database. It is concluded that none of the design alternatives is the best solution for all aspects, but the most promising combination is proposed.
22

Esteve, García Albert. "Design of Efficient TLB-based Data Classification Mechanisms in Chip Multiprocessors". Doctoral thesis, Universitat Politècnica de València, 2017. http://hdl.handle.net/10251/86136.

Abstract:
Most of the data referenced by sequential and parallel applications running in current chip multiprocessors are referenced by a single thread, i.e., they are private. Recent proposals leverage this observation to improve many aspects of chip multiprocessors, such as reducing coherence overhead or the access latency to distributed caches. The effectiveness of those proposals depends to a large extent on the amount of detected private data. However, the mechanisms proposed so far either do not consider thread migration or the private use of data within different application phases, or they entail high overhead. As a result, a considerable amount of private data is not detected. In order to increase the detection of private data, this thesis proposes a TLB-based mechanism that is able to account for both thread migration and private application phases with low overhead. Classification status in the proposed TLB-based classification mechanisms is determined by the presence of the page translation stored in other cores' TLBs. The classification schemes are analyzed in multilevel TLB hierarchies, for systems with both private and distributed shared last-level TLBs. This thesis introduces a page classification approach based on inspecting other cores' TLBs upon every TLB miss. In particular, the proposed classification approach is based on the exchange and counting of tokens. Token counting on TLBs is a natural and efficient way of classifying memory pages. It does not require the use of complex and undesirable persistent requests or arbitration, since when two or more TLBs race for access to a page, tokens are appropriately distributed, classifying the page as shared. However, the ability of TLB-based schemes to classify private pages is strongly dependent on TLB size, as it relies on the presence of a page translation in the system TLBs. To overcome this, different TLB usage predictors (UP) have been proposed, which allow a page classification unaffected by TLB size. Specifically, this thesis introduces a predictor that obtains system-wide page usage information by either employing a shared last-level TLB structure (SUP) or cooperative TLBs working together (CUP).
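The token-counting idea above can be sketched as a toy functional model in Python: every page carries as many tokens as there are cores, the first core to miss takes them all and sees the page as private, and any later core that misses causes the tokens to be redistributed, so the page is seen as shared. TLB capacity, eviction, multilevel hierarchies and the reclassification and usage-predictor mechanisms from the thesis are deliberately left out.

```python
class TokenDirectory:
    """Toy model of token-based private/shared page classification."""

    def __init__(self, num_cores: int):
        self.num_cores = num_cores
        self.tokens = {}          # page -> {core: token count}

    def tlb_miss(self, core: int, page: int) -> str:
        holders = self.tokens.setdefault(page, {})
        if not holders:
            holders[core] = self.num_cores        # first toucher takes all tokens
        elif list(holders) != [core]:
            for c in holders:                     # a second core appears: share tokens out
                holders[c] = 1
            holders[core] = 1
        return "private" if holders.get(core) == self.num_cores else "shared"

d = TokenDirectory(num_cores=4)
print(d.tlb_miss(core=0, page=0x1000))   # private
print(d.tlb_miss(core=1, page=0x1000))   # shared
print(d.tlb_miss(core=0, page=0x2000))   # private
```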
Esteve García, A. (2017). Design of Efficient TLB-based Data Classification Mechanisms in Chip Multiprocessors [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/86136
23

Triplett, Josh. "Relativistic Causal Ordering: A Memory Model for Scalable Concurrent Data Structures". PDXScholar, 2012. https://pdxscholar.library.pdx.edu/open_access_etds/497.

Abstract:
High-performance programs and systems require concurrency to take full advantage of available hardware. However, the available concurrent programming models force a difficult choice between simple models, such as mutual exclusion, that produce little to no concurrency, and complex models, such as Read-Copy Update, that can scale to all available resources. Simple concurrent programming models enforce atomicity and causality, and this enforcement limits concurrency. Scalable concurrent programming models expose the weakly ordered hardware memory model, requiring careful and explicit enforcement of causality to preserve correctness, as demonstrated in this dissertation through the manual construction of a scalable hash-table item-move algorithm. Recent research on "relativistic programming" aims to standardize the programming model of Read-Copy Update, but thus far these efforts have lacked a generalized memory ordering model, requiring data-structure-specific reasoning to preserve causality. I propose a new memory ordering model, "relativistic causal ordering", which combines the scalability of relativistic programming and Read-Copy Update with the simplicity of reader atomicity and automatic enforcement of causality. Programs written for the relativistic model translate to scalable concurrent programs for weakly-ordered hardware via a mechanical process of inserting barrier operations according to well-defined rules. To demonstrate the relativistic causal ordering model, I walk through the straightforward construction of a novel concurrent hash-table resize algorithm, including the translation of this algorithm from the relativistic model to a hardware memory model, and show through benchmarks that the resulting algorithm scales far better than those based on mutual exclusion.
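For readers unfamiliar with the Read-Copy Update style referred to above, the sketch below shows only the basic copy-then-publish pattern in Python: readers dereference an immutable snapshot without locking, while a writer copies it, applies the change and publishes the new version with a single reference swap. CPython's global interpreter lock hides the weak memory ordering that relativistic causal ordering is actually about, so this illustrates the programming pattern rather than the dissertation's memory model.

```python
import threading
from types import MappingProxyType

class RcuMap:
    """Copy-then-publish lookup table in the spirit of Read-Copy Update."""

    def __init__(self):
        self._snapshot = MappingProxyType({})   # immutable view handed to readers
        self._writer_lock = threading.Lock()    # writers are serialized

    def get(self, key, default=None):
        return self._snapshot.get(key, default)          # lock-free read

    def put(self, key, value):
        with self._writer_lock:
            updated = dict(self._snapshot)                # copy
            updated[key] = value                          # update
            self._snapshot = MappingProxyType(updated)    # publish (reference swap)

m = RcuMap()
m.put("x", 1)
print(m.get("x"))
```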
24

Otto, Christian. "The mapping task and its various applications in next-generation sequencing". Doctoral thesis, Universitätsbibliothek Leipzig, 2015. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-161623.

Abstract:
The aim of this thesis is the development and benchmarking of computational methods for the analysis of high-throughput data from tiling arrays and next-generation sequencing. Tiling arrays have been a mainstay of genome-wide transcriptomics, e.g., in the identification of functional elements in the human genome. Due to limitations of existing methods for the analysis of these data, a novel statistical approach is presented that identifies expressed segments as significant differences from the background distribution and thus avoids dataset-specific parameters. This method detects differentially expressed segments in biological data with significantly lower false discovery rates and equivalent sensitivities compared to commonly used methods. In addition, it is also clearly superior in the recovery of exon-intron structures. Moreover, the search for local accumulations of expressed segments in tiling array data has led to the identification of very large expressed regions that may constitute a new class of macroRNAs. This thesis proceeds with next-generation sequencing, for which various protocols have been devised to study genomic, transcriptomic, and epigenomic features. One of the first crucial steps in most NGS data analyses is the mapping of sequencing reads to a reference genome. This work introduces algorithmic methods to solve the mapping task for three major NGS protocols: DNA-seq, RNA-seq, and MethylC-seq. All methods have been thoroughly benchmarked and integrated into the segemehl mapping suite. First, mapping of DNA-seq data is facilitated by the core mapping algorithm of segemehl. Since the initial publication, it has been continuously updated and expanded. Here, extensive and reproducible benchmarks are presented that compare segemehl to state-of-the-art read aligners on various data sets. The results indicate that it is not only more sensitive in finding the optimal alignment with respect to the unit edit distance but also very specific compared to the most commonly used alternative read mappers. These advantages are observable for both real and simulated reads and are largely independent of the read length and sequencing technology, but come at the cost of higher running time and memory consumption. Second, the split-read extension of segemehl, presented by Hoffmann, enables the mapping of RNA-seq data, a computationally more difficult form of the mapping task due to the occurrence of splicing. Here, the novel tool lack is presented, which aims to recover missed RNA-seq read alignments using de novo splice junction information. It performs very well in benchmarks and may thus be a beneficial extension to RNA-seq analysis pipelines. Third, a novel method is introduced that facilitates the mapping of bisulfite-treated sequencing data. This protocol is considered the gold standard in genome-wide studies of DNA methylation, one of the major epigenetic modifications in animals and plants. The treatment of DNA with sodium bisulfite selectively converts unmethylated cytosines to uracils, while methylated ones remain unchanged. The bisulfite extension developed here performs seed searches on a collapsed alphabet followed by bisulfite-sensitive dynamic programming alignments. Thus, it is insensitive to bisulfite-related mismatches and does not rely on post-processing, in contrast to other methods. In comparison to state-of-the-art tools, this method achieves significantly higher sensitivities and is time-competitive in mapping millions of sequencing reads to vertebrate genomes. Remarkably, the increase in sensitivity does not come at the cost of decreased specificity and thus may finally result in better performance in calling the methylation rate. Lastly, the potential of mapping strategies for de novo genome assemblies is demonstrated with the introduction of a new guided assembly procedure. It incorporates mapping as a major component and uses additional information (e.g., annotation) as a guide. With this method, the complete mitochondrial genome of Eulimnogammarus verrucosus has been successfully assembled even though the sequencing library was heavily dominated by nuclear DNA. In summary, this thesis introduces algorithmic methods that significantly improve the analysis of tiling array, DNA-seq, RNA-seq, and MethylC-seq data, and proposes standards for benchmarking NGS read aligners. Moreover, it presents a new guided assembly procedure that has been successfully applied in the de novo assembly of a crustacean mitogenome.
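The collapsed-alphabet seed search mentioned above can be illustrated with a tiny example: converting every C to T in both the read and the reference makes bisulfite-converted (unmethylated) cytosines match again, so candidate positions can be found by plain string search. The sequences, the seed length and the absence of the subsequent bisulfite-sensitive dynamic-programming verification are all simplifications of what segemehl actually does.

```python
def collapse_ct(seq: str) -> str:
    """Map C to T so bisulfite-converted reads match the reference."""
    return seq.upper().replace("C", "T")

def bisulfite_seed_hits(read: str, ref: str, seed_len: int = 12) -> list[int]:
    """Find candidate mapping positions on a collapsed (C->T) alphabet."""
    seed, collapsed_ref = collapse_ct(read[:seed_len]), collapse_ct(ref)
    hits, start = [], collapsed_ref.find(seed)
    while start != -1:
        hits.append(start)
        start = collapsed_ref.find(seed, start + 1)
    return hits

ref  = "AACGTTCGGATCCGTACGATCG"
read = "TTCGGATTCGTACGAT"        # bisulfite-converted: one non-CpG C now reads as T
print(bisulfite_seed_hits(read, ref, seed_len=8))   # [4]: the read anchors at reference position 4
```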
Diese Arbeit befasst sich mit der Entwicklung und dem Benchmarken von Verfahren zur Analyse von Daten aus Hochdurchsatz-Technologien, wie Tiling Arrays oder Hochdurchsatz-Sequenzierung. Tiling Arrays bildeten lange Zeit die Grundlage für die genomweite Untersuchung des Transkriptoms und kamen beispielsweise bei der Identifizierung funktioneller Elemente im menschlichen Genom zum Einsatz. In dieser Arbeit wird ein neues statistisches Verfahren zur Auswertung von Tiling Array-Daten vorgestellt. Darin werden Segmente als exprimiert klassifiziert, wenn sich deren Signale signifikant von der Hintergrundverteilung unterscheiden. Dadurch werden keine auf den Datensatz abgestimmten Parameterwerte benötigt. Die hier vorgestellte Methode erkennt differentiell exprimierte Segmente in biologischen Daten bei gleicher Sensitivität mit geringerer Falsch-Positiv-Rate im Vergleich zu den derzeit hauptsächlich eingesetzten Verfahren. Zudem ist die Methode bei der Erkennung von Exon-Intron Grenzen präziser. Die Suche nach Anhäufungen exprimierter Segmente hat darüber hinaus zur Entdeckung von sehr langen Regionen geführt, welche möglicherweise eine neue Klasse von macroRNAs darstellen. Nach dem Exkurs zu Tiling Arrays konzentriert sich diese Arbeit nun auf die Hochdurchsatz-Sequenzierung, für die bereits verschiedene Sequenzierungsprotokolle zur Untersuchungen des Genoms, Transkriptoms und Epigenoms etabliert sind. Einer der ersten und entscheidenden Schritte in der Analyse von Sequenzierungsdaten stellt in den meisten Fällen das Mappen dar, bei dem kurze Sequenzen (Reads) auf ein großes Referenzgenom aligniert werden. Die vorliegende Arbeit stellt algorithmische Methoden vor, welche das Mapping-Problem für drei wichtige Sequenzierungsprotokolle (DNA-Seq, RNA-Seq und MethylC-Seq) lösen. Alle Methoden wurden ausführlichen Benchmarks unterzogen und sind in der segemehl-Suite integriert. Als Erstes wird hier der Kern-Algorithmus von segemehl vorgestellt, welcher das Mappen von DNA-Sequenzierungsdaten ermöglicht. Seit der ersten Veröffentlichung wurde dieser kontinuierlich optimiert und erweitert. In dieser Arbeit werden umfangreiche und auf Reproduzierbarkeit bedachte Benchmarks präsentiert, in denen segemehl auf zahlreichen Datensätzen mit bekannten Mapping-Programmen verglichen wird. Die Ergebnisse zeigen, dass segemehl nicht nur sensitiver im Auffinden von optimalen Alignments bezüglich der Editierdistanz sondern auch sehr spezifisch im Vergleich zu anderen Methoden ist. Diese Vorteile sind in realen und simulierten Daten unabhängig von der Sequenzierungstechnologie oder der Länge der Reads erkennbar, gehen aber zu Lasten einer längeren Laufzeit und eines höheren Speicherverbrauchs. Als Zweites wird das Mappen von RNA-Sequenzierungsdaten untersucht, welches bereits von der Split-Read-Erweiterung von segemehl unterstützt wird. Aufgrund von Spleißen ist diese Form des Mapping-Problems rechnerisch aufwendiger. In dieser Arbeit wird das neue Programm lack vorgestellt, welches darauf abzielt, fehlende Read-Alignments mit Hilfe von de novo Spleiß-Information zu finden. Es erzielt hervorragende Ergebnisse und stellt somit eine sinnvolle Ergänzung zu Analyse-Pipelines für RNA-Sequenzierungsdaten dar. Als Drittes wird eine neue Methode zum Mappen von Bisulfit-behandelte Sequenzierungsdaten vorgestellt. Dieses Protokoll gilt als Goldstandard in der genomweiten Untersuchung der DNA-Methylierung, einer der wichtigsten epigenetischen Modifikationen in Tieren und Pflanzen. 
Dabei wird die DNA vor der Sequenzierung mit Natriumbisulfit behandelt, welches selektiv nicht methylierte Cytosine zu Uracilen konvertiert, während Methylcytosine davon unberührt bleiben. Die hier vorgestellte Bisulfit-Erweiterung führt die Seed-Suche auf einem reduziertem Alphabet durch und verifiziert die erhaltenen Treffer mit einem auf dynamischer Programmierung basierenden Bisulfit-sensitiven Alignment-Algorithmus. Das verwendete Verfahren ist somit unempfindlich gegenüber Bisulfit-Konvertierungen und erfordert im Gegensatz zu anderen Verfahren keine weitere Nachverarbeitung. Im Vergleich zu aktuell eingesetzten Programmen ist die Methode sensitiver und benötigt eine vergleichbare Laufzeit beim Mappen von Millionen von Reads auf große Genome. Bemerkenswerterweise wird die erhöhte Sensitivität bei gleichbleibend guter Spezifizität erreicht. Dadurch könnte diese Methode somit auch bessere Ergebnisse bei der präzisen Bestimmung der Methylierungsraten erreichen. Schließlich wird noch das Potential von Mapping-Strategien für Assemblierungen mit der Einführung eines neuen, Kristallisation-genanntes Verfahren zur unterstützten Assemblierung aufgezeigt. Es enthält Mapping als Hauptbestandteil und nutzt Zusatzinformation (z.B. Annotationen) als Unterstützung. Dieses Verfahren ermöglichte die erfolgreiche Assemblierung des kompletten mitochondrialen Genoms von Eulimnogammarus verrucosus trotz einer vorwiegend aus nukleärer DNA bestehenden genomischen Bibliothek. Zusammenfassend stellt diese Arbeit algorithmische Methoden vor, welche die Analysen von Tiling Array, DNA-Seq, RNA-Seq und MethylC-Seq Daten signifikant verbessern. Es werden zudem Standards für den Vergleich von Programmen zum Mappen von Daten der Hochdurchsatz-Sequenzierung vorgeschlagen. Darüber hinaus wird ein neues Verfahren zur unterstützten Genom-Assemblierung vorgestellt, welches erfolgreich bei der de novo-Assemblierung eines mitochondrialen Krustentier-Genoms eingesetzt wurde
Style APA, Harvard, Vancouver, ISO itp.
25

Lama, Luca. "Development and testing of the atlas ibl rod pre production boards". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2013. http://amslaurea.unibo.it/6283/.

Pełny tekst źródła
Streszczenie:
The work of this thesis mainly concerns the design, simulation, and laboratory testing of three successive versions of VME boards, called Read Out Drivers (ROD), which were manufactured for the 2014 upgrade of the ATLAS Insertable B-Layer (IBL) experiment at CERN. IBL is a new layer that will become part of the ATLAS Pixel Detector. The thesis gives a descriptive overview of the ATLAS experiment in general and then focuses on the description of the specific IBL layer. It also covers physical and technical aspects in detail: design specifications, the development path of the boards, and the subsequent tests. The boards were first produced as two prototypes to test the performance of the system; these were manufactured in order to evaluate the overall characteristics and performance of the readout system. A second production batch, consisting of five boards, was aimed at fine-tuning the critical issues that emerged from the tests of the first batch. A thorough and detailed investigation of the system then prepared the boards for the fabrication of a third batch of another five boards. Production is now finished and a total of 20 final boards have been produced and are currently being tested. The production will be validated shortly, and the 20 boards will be delivered to CERN to be integrated into the data acquisition system of the detector. At present, the Department of Physics and Astronomy of the University of Bologna is involved in a pixel experiment only through the IBL described in this thesis. In conclusion, the thesis work focused mainly on testing the boards and on the design of the firmware required for the calibration and data taking of the detector.
Style APA, Harvard, Vancouver, ISO itp.
26

Westerberg, Ellinor. "Efficient delta based updates for read-only filesystem images : An applied study in how to efficiently update the software of an ECU". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291740.

Pełny tekst źródła
Streszczenie:
This thesis investigates a method for efficiently updating the software of an Electronic Control Unit (ECU) in a car. The patch sent to the car should be as small as possible and optimally contain only the changed part of the software. A popular algorithm for creating the patch is bsdiff. However, it is not made for filesystem images, but for binaries. Therefore, an alternative is investigated. The alternative method is based on the update engine in Android. A standalone version of the Android A/B Update is implemented and compared to bsdiff with respect to the time it takes to generate the patch and the size of the patch. The result shows that bsdiff generates a slightly smaller patch. However, bsdiff is also a lot slower at generating the patch. Furthermore, the time increases linearithmically with the size of the filesystem image. This gives reason to believe that the Android A/B Update algorithm might be a better solution when updating an ECU that contains a full filesystem. However, this depends on whether it is more valuable that the patch is as small as possible, or that the process of generating it is fast.
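To make the contrast between the two approaches concrete, here is a minimal, hypothetical sketch of a block-based delta in the spirit of the A/B-update idea: operate on fixed-size filesystem blocks, reuse unchanged blocks from the old image, and ship only changed blocks as literal data. This is not the Android update_engine payload format and not bsdiff; the block size and operation encoding are assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; real filesystem images may differ

def block_delta(old: bytes, new: bytes):
    """Return a list of ops: ('copy', old_block_index) or ('data', literal_bytes)."""
    old_index = {}
    for i in range(0, len(old), BLOCK_SIZE):
        digest = hashlib.sha256(old[i:i + BLOCK_SIZE]).digest()
        old_index.setdefault(digest, i // BLOCK_SIZE)
    ops = []
    for j in range(0, len(new), BLOCK_SIZE):
        block = new[j:j + BLOCK_SIZE]
        digest = hashlib.sha256(block).digest()
        if digest in old_index:
            ops.append(("copy", old_index[digest]))   # block unchanged: reference it
        else:
            ops.append(("data", block))               # block changed: ship literal data
    return ops

def apply_delta(old: bytes, ops):
    """Rebuild the new image from the old image and the delta operations."""
    out = bytearray()
    for kind, value in ops:
        if kind == "copy":
            start = value * BLOCK_SIZE
            out += old[start:start + BLOCK_SIZE]
        else:
            out += value
    return bytes(out)

old = bytes(range(256)) * 64                         # toy "old image" (4 blocks)
new = old[:4096] + b"\x00" * 4096 + old[8192:]       # one block rewritten
ops = block_delta(old, new)
assert apply_delta(old, ops) == new
patch_size = sum(len(v) for k, v in ops if k == "data")
print(f"{len(ops)} ops, {patch_size} literal bytes shipped")
```

The trade-off discussed in the abstract shows up directly in such a scheme: block-level deltas are cheap to compute but can only deduplicate whole unchanged blocks, whereas a byte-level differ like bsdiff can usually produce a smaller patch at a higher computational cost.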
Detta examensarbete undersöker en metod för att effektivt uppdatera mjukvaran i en styrenhet i en bil. En patch som skickas till en bil ska vara så liten som möjligt och helst enbart innehålla de delar av mjukvaran som ändrats. En populär algoritm för att skapa en sådan patch är bsdiff. Den är dock inte gjord för filsystemsavbildningar, utan för binärer. Därför studeras här ett alternativ. Denna alternativa metod är baserad på Androids uppdateringsprocess. En fristående variant av Android A/B Update är implementerad och jämförd med bsdiff, med avseende på tiden det tar att generera en patch och storleken av den. Resultatet visar att bsdiff genererar mindre patchar. Däremot är bsdiff också betydligt långsammare. Vidare ökar tiden linearitmiskt då storleken på patchen ökar. Detta innebär att Android A/B Update kan vara en bättre lösning för att uppdatera en styrenhet som innehåller ett filsystem. Det beror dock på vad som värderas högst; en mindre patch eller att processen att skapa patchen ska vara snabbare.
Style APA, Harvard, Vancouver, ISO itp.
27

Pitz, Nora [Verfasser], Harald [Akademischer Betreuer] Appelshäuser i Christoph [Akademischer Betreuer] Blume. "Gas system, gas quality monitor and detector control of the ALICE Transition Radiation Detector and studies for a pre-trigger data read-out system / Nora Pitz. Gutachter: Harald Appelshäuser ; Christoph Blume". Frankfurt am Main : Univ.-Bibliothek Frankfurt am Main, 2012. http://d-nb.info/1044412801/34.

Pełny tekst źródła
Style APA, Harvard, Vancouver, ISO itp.
28

Ayyad, Majed. "Real-Time Event Centric Data Integration". Doctoral thesis, University of Trento, 2014. http://eprints-phd.biblio.unitn.it/1353/1/REAL-TIME_EVENT_CENTRIC_DATA_INTEGRATION.pdf.

Pełny tekst źródła
Streszczenie:
A vital step in integrating data from multiple sources is detecting and handling duplicate records that refer to the same real-life entity. Events are spatio-temporal entities that reflect changes in the real world and are received or captured from different sources (sensors, mobile phones, social network services, etc.). In many real-world situations, detecting events mostly takes place through multiple observations by different observers. The local view of an observer reflects only partial knowledge with a certain granularity of time and space. Observations occur at a particular place and time, whereas events, which are inferred from observations, range over time and space. In this thesis, we address the problem of event matching, which is the task of detecting similar events in the recent past from their observations. We focus on detecting hyperlocal events, which are an integral part of any dynamic human decision-making process and are useful for different multi-tier responding agencies such as emergency medical services, public safety and law enforcement agencies, and organizations fusing news from different sources, as well as for citizens. In an environment where continuous monitoring and processing is required, the matching task imposes different challenges. In particular, the matching task is decomposed into four separate tasks, each requiring a different computational method. The four tasks are: event-type similarity, similarity in location, similarity in time, and thematic-role similarity that handles participant similarity. We refer to these four tasks as local similarities. A global similarity measure then combines the four local similarities before the events can be clustered and handled in a robust near real-time system. We address local similarity by thoroughly studying existing similarity measures and proposing a suitable measure for each task. We utilize ideas from the semantic web, qualitative spatial reasoning, fuzzy sets, and structural alignment similarities in order to define the local similarity measures. We then address global similarity by treating the problem as a relational learning problem and use machine learning to learn the weights of each local similarity. To learn the weights, we combine the features of each pair of events into one object and use logistic regression and support vector machines. The learned weighting function is tested and evaluated on a real dataset and used to predict the similarity class of newly streamed events.
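The global similarity step described here, learning a weighted combination of the four local similarities, can be sketched as follows. This is a hedged illustration rather than the author's pipeline: the feature values and labels are made up, and scikit-learn's LogisticRegression stands in for the learning component mentioned in the abstract.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row holds the four local similarities for one pair of event observations:
# [event-type sim, location sim, time sim, thematic-role (participant) sim].
# Labels: 1 = same real-world event, 0 = different events (toy data).
X = np.array([
    [0.9, 0.8, 0.9, 0.7],
    [0.8, 0.9, 0.7, 0.9],
    [0.2, 0.1, 0.3, 0.2],
    [0.1, 0.4, 0.2, 0.1],
    [0.7, 0.6, 0.8, 0.5],
    [0.3, 0.2, 0.1, 0.4],
])
y = np.array([1, 1, 0, 0, 1, 0])

model = LogisticRegression()
model.fit(X, y)

# The learned coefficients act as the weights of the global similarity measure.
print("learned weights:", model.coef_[0])

# Decide whether a newly streamed pair of observations refers to the same event.
new_pair = np.array([[0.85, 0.7, 0.9, 0.6]])
print("same-event probability:", model.predict_proba(new_pair)[0, 1])
```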
Style APA, Harvard, Vancouver, ISO itp.
29

Frick, Kolmyr Sara, i Thingvall Katarina Juhlin. "Samhällsinformation för alla? : Hur man anpassar ett informationsmaterial till både en lässvag och lässtark målgrupp". Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-1521.

Pełny tekst źródła
Streszczenie:
The purpose of our degree project is to create information material for people newly moved to the municipality of Västerås that as many people as possible can make use of. Our goal is that both weak and strong readers should feel equally valued when they read the material. Our research question is therefore: Is it possible to combine the principles of information design with the principles of easy-to-read text in order to reach both a weak-reader group and a strong-reader group, so that both feel equally important and absorb the message? And if so, how can this be done? Our client is Centrum för lättläst, which works to make information accessible to people who have difficulty reading. They need a user-tested reference example of accessible information material that they can show to different municipalities in their work towards an accessible society. To answer our research question, we use literature studies as well as qualitative and quantitative studies in the form of user tests. The results of the user tests determine how the information material is revised and finalized. Our conclusion is that the principles of information design and the principles of easy-to-read text are a good combination for making information material accessible. To know how accessible a material is, user tests with different target groups are required. Each target group has its own needs, and it is impossible to satisfy everyone with one and the same information material. However, our user tests show that it is possible to reach several different target groups with one and the same information material.
Style APA, Harvard, Vancouver, ISO itp.
30

Hernane, Soumeya-Leila. "Modèles et algorithmes de partage de données cohérents pour le calcul parallèle distribué à haut débit". Thesis, Université de Lorraine, 2013. http://www.theses.fr/2013LORR0042/document.

Pełny tekst źródła
Streszczenie:
Data Handover est une librairie de fonctions adaptée aux systèmes distribués à grande échelle. Dho offre des routines qui permettent d'acquérir des ressources en lecture ou en écriture de façon cohérente et transparente pour l'utilisateur. Nous avons modélisé le cycle de vie de Dho par un automate d'état fini puis, constaté expérimentalement, que notre approche produit un recouvrement entre le calcul de l'application et le contrôle de la donnée. Les expériences ont été menées en mode simulé en utilisant la libraire GRAS de SimGrid puis, en exploitant un environnement réel sur la plate-forme Grid'5000. Par la théorie des files d'attente, la stabilité du modèle a été démontrée dans un contexte centralisé. L'algorithme distribué d'exclusion mutuelle de Naimi et Tréhel a été enrichi pour offrir les fonctionnalités suivantes: (1) Permettre la connexion et la déconnexion des processus (ADEMLE), (2) admettre les locks partagés (AEMLEP) et enfin (3) associer les deux propriétés dans un algorithme récapitulatif (ADEMLEP). Les propriétés de sûreté et de vivacité ont été démontrées théoriquement. Le système peer-to-peer proposé combine nos algorithmes étendus et le modèle originel Dho. Les gestionnaires de verrou et de ressource opèrent et interagissent mutuellement dans une architecture à trois niveaux. Suite à l'étude expérimentale du système sous-jacent menée sur Grid'5000, et des résultats obtenus, nous avons démontré la performance et la stabilité du modèle Dho face à une multitude de paramètres
Data Handover is a library of functions adapted to large-scale distributed systems. It provides routines that allow resources to be acquired for reading or writing in a way that is coherent and transparent for users. We modelled the life cycle of Dho by a finite state automaton and, through experiments, found that our approach produces an overlap between the computation of the application and the control of the data. These experiments were conducted both in simulated mode and in a real environment (Grid'5000). We exploited the GRAS library of the SimGrid toolkit. Several clients try to access the resource concurrently according to the client-server paradigm. Using queueing theory, the stability of the model was demonstrated in a centralized environment. We improved the distributed mutual exclusion algorithm of Naimi and Tréhel by introducing the following features: (1) allowing the mobility of processes (ADEMLE), (2) introducing shared locks (AEMLEP), and finally (3) merging both properties into a combined algorithm (ADEMLEP). We proved the safety and liveness properties theoretically for all extended algorithms. The proposed peer-to-peer system combines our extended algorithms and the original Data Handover model. Lock and resource managers operate and interact with each other in a three-level architecture. Following the experimental study of the underlying system on Grid'5000, and the results obtained, we demonstrated the performance and stability of the Dho model over a wide range of parameters.
Style APA, Harvard, Vancouver, ISO itp.
31

Engel, Heiko [Verfasser], Udo [Gutachter] Kebschull i Lars [Gutachter] Hedrich. "Development of a read-out receiver card for fast processing of detector data : ALICE HLT run 2 readout upgrade and evaluation of dataflow hardware description for high energy physics readout applications / Heiko Engel ; Gutachter: Udo Kebschull, Lars Hedrich". Frankfurt am Main : Universitätsbibliothek Johann Christian Senckenberg, 2019. http://d-nb.info/1192372166/34.

Pełny tekst źródła
Style APA, Harvard, Vancouver, ISO itp.
32

Fujimoto, Masaki Stanley. "Graph-Based Whole Genome Phylogenomics". BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8461.

Pełny tekst źródła
Streszczenie:
Understanding others is a deeply human urge, basic to our existential quest. It requires knowing where someone has come from and where they sit amongst peers. Phylogenetic analysis and genome-wide association studies seek to tell us where we have come from and where we are relative to one another through evolutionary history and genetic makeup. Current methods do not address the computational complexity caused by new forms of genomic data, namely long-read DNA sequencing and the ever-growing number of assembled genomes. To address this, we explore specialized data structures for storing and comparing genomic information. This work resulted in the creation of novel data structures for storing multiple genomes that can be used for identifying structural variations and other types of polymorphisms. Using these methods, we illuminate the genetic history of organisms in our efforts to understand the world around us.
Style APA, Harvard, Vancouver, ISO itp.
33

Macias, Filiberto. "Real Time Telemetry Data Processing and Data Display". International Foundation for Telemetering, 1996. http://hdl.handle.net/10150/611405.

Pełny tekst źródła
Streszczenie:
International Telemetering Conference Proceedings / October 28-31, 1996 / Town and Country Hotel and Convention Center, San Diego, California
The Telemetry Data Center (TDC) at White Sands Missile Range (WSMR) is now beginning to modernize its existing telemetry data processing system. Modern networking and interactive graphical displays are now being introduced. This infusion of modern technology will allow the TDC to provide our customers with enhanced data processing and display capability. The intent of this project is to outline this undertaking.
Style APA, Harvard, Vancouver, ISO itp.
34

Kalinda, Mkenda Beatrice. "Essays on purchasing power parity, real exchange rate, and optimum currency areas /". Göteborg : Nationalekonomiska institutionen, Handelshögsk, 2000. http://www.handels.gu.se/epc/data/html/html/1973.html.

Pełny tekst źródła
Style APA, Harvard, Vancouver, ISO itp.
35

Ostroumov, Ivan Victorovich. "Real time sensors data processing". Thesis, Polit. Challenges of science today: XIV International Scientific and Practical Conference of Young Researchers and Students, April 2–3, 2014 : theses. – К., 2014. – 35p, 2014. http://er.nau.edu.ua/handle/NAU/26582.

Pełny tekst źródła
Streszczenie:
The sensor is the most powerful part of any system. The aviation industry is a place where millions of sensors are used for different purposes. Another very important task of avionics equipment is data transfer between sensors and processing equipment. Why is it so important to transmit data online into MatLab? Unmanned aerial vehicles are developing rapidly nowadays. If we can transmit data from UAV sensors into MatLab, then we can process it and obtain the desired information about the UAV. Of course, we have to use the cheapest way to transfer the data. Today, everyone in the world has a mobile phone. Many of them have different sensors, such as: pressure sensor, temperature sensor, gravity sensor, gyroscope, rotation vector sensor, proximity sensor, light sensor, orientation sensor, magnetic field sensor, accelerometer, GPS receiver and so on. It would be very useful if we could use real-time data from cell phone sensors for some navigation tasks. In our work we use the mobile phone Samsung Galaxy SIII with all of the sensors listed above except the temperature sensor. There are many existing programs for reading and displaying data from sensors, such as: “Sensor Kinetics”, “Sensors”, “Data Recording”, “Android Sensors Viewer”. We used “Data Recording”. For transmitting data from the cell phone, the following methods are available: GPRS (mobile internet); Bluetooth; USB cable; Wi-Fi. After comparing these methods, we concluded that GPRS is inconvenient because it has to be paid for, Bluetooth has limited range, and a USB cable does not offer the portability of the other methods. We therefore decided that Wi-Fi is the optimal method of transmitting data for our goal.
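As a rough illustration of the chosen Wi-Fi transfer path, the sketch below receives sensor samples over a UDP socket on a workstation; the values could then be forwarded to, or re-implemented in, MatLab. The port number and the comma-separated line format are assumptions for the sketch, not the format used by the “Data Recording” app.

```python
import socket

HOST, PORT = "0.0.0.0", 5555  # assumed port; configure the phone app accordingly

def receive_samples():
    """Listen for UDP datagrams of the assumed form 'sensor,timestamp,x,y,z'."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((HOST, PORT))
    print(f"listening on {HOST}:{PORT}")
    while True:
        data, addr = sock.recvfrom(1024)
        try:
            sensor, ts, x, y, z = data.decode().strip().split(",")
            yield sensor, float(ts), (float(x), float(y), float(z))
        except ValueError:
            continue  # skip malformed datagrams

if __name__ == "__main__":
    for sensor, ts, values in receive_samples():
        print(sensor, ts, values)
```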
Style APA, Harvard, Vancouver, ISO itp.
36

Maiga, Aïssata, i Johanna Löv. "Real versus Simulated data for Image Reconstruction : A comparison between training with sparse simulated data and sparse real data". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302028.

Pełny tekst źródła
Streszczenie:
Our study investigates how training with sparse simulated data versus sparse real data affects image reconstruction. We compared the two on several criteria, such as the number of events, speed, and high dynamic range (HDR). The results indicate that the difference between simulated data and real data is not large. Training with real data often performed better, but only by 2%. The findings confirm what earlier studies have shown: training with simulated data generalises well, even when training on sparse datasets, as this study shows.
Vår studie undersöker hur träning med gles simulerad data och gles verklig data från en eventkamera, påverkar bildrekonstruktion. Vi tränade två modeller, en med simulerad data och en med verklig för att sedan jämföra dessa på ett flertal kriterier som antal event, hastighet och high dynamic range, HDR. Resultaten visar att skillnaden mellan att träna med simulerad data och verklig data inte är stor. Modellen tränad med verklig data presterade bättre i de flesta fall, men den genomsnittliga skillnaden mellan resultaten är bara 2%. Resultaten bekräftar vad tidigare studier har visat; träning med simulerad data generaliserar bra, och som denna studie visar även vid träning på glesa datamängder.
Style APA, Harvard, Vancouver, ISO itp.
37

Jafar, Fatmeh Nazmi Ahmad. "Simulating traditional traffic data from satellite data-preparing for real satellite data test /". The Ohio State University, 2000. http://rave.ohiolink.edu/etdc/view?acc_num=osu1488193665235894.

Pełny tekst źródła
Style APA, Harvard, Vancouver, ISO itp.
38

Kilpatrick, Stephen, Galen Rasche, Chris Cunningham, Myron Moodie i Ben Abbott. "REORDERING PACKET BASED DATA IN REAL-TIME DATA ACQUISITION SYSTEMS". International Foundation for Telemetering, 2007. http://hdl.handle.net/10150/604571.

Pełny tekst źródła
Streszczenie:
ITC/USA 2007 Conference Proceedings / The Forty-Third Annual International Telemetering Conference and Technical Exhibition / October 22-25, 2007 / Riviera Hotel & Convention Center, Las Vegas, Nevada
Ubiquitous internet protocol (IP) hardware has reached performance and capability levels that allow its use in data collection and real-time processing applications. Recent development experience with IP-based airborne data acquisition systems has shown that the open, pre-existing IP tools, standards, and capabilities support this form of distribution and sharing of data quite nicely, especially when combined with IP multicast. Unfortunately, the packet-based nature of our approach also posed some problems that required special handling to meet performance requirements. We have developed methods and algorithms for the filtering, selection, and retiming problems associated with packet-based systems and present our approach in this paper.
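One common way to handle the retiming and reordering problem mentioned above is a small reordering buffer keyed by sequence number. The sketch below is a generic illustration under assumed packet fields (a sequence number plus a payload), not the authors' implementation.

```python
import heapq

class ReorderBuffer:
    """Release packets in sequence-number order, tolerating limited reordering."""

    def __init__(self, window: int = 8):
        self.window = window        # how many packets we are willing to hold back
        self.heap = []              # min-heap of (seq, payload)
        self.next_seq = 0           # next sequence number expected downstream

    def push(self, seq: int, payload: bytes):
        """Insert an arriving packet and return any packets now deliverable in order."""
        heapq.heappush(self.heap, (seq, payload))
        out = []
        # Deliver consecutive packets, or flush the oldest held packet if the
        # buffer exceeds the window (treating the missing packet as lost).
        while self.heap and (self.heap[0][0] == self.next_seq or len(self.heap) > self.window):
            seq_out, data = heapq.heappop(self.heap)
            out.append((seq_out, data))
            self.next_seq = seq_out + 1
        return out

buf = ReorderBuffer(window=4)
arrival_order = [0, 2, 1, 4, 3, 5]          # packets arriving out of order
for seq in arrival_order:
    for released in buf.push(seq, b"payload"):
        print("deliver", released[0])       # prints 0, 1, 2, 3, 4, 5 in order
```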
Style APA, Harvard, Vancouver, ISO itp.
39

Ng, Sunny, Mei Y. Wei, Austin Somes, Mich Aoyagi i Joe Leung. "REAL-TIME DATA SERVER-CLIENT SYSTEM FOR THE NEAR REAL-TIME RESEARCH ANALYSIS OF ENSEMBLE DATA". International Foundation for Telemetering, 1998. http://hdl.handle.net/10150/609671.

Pełny tekst źródła
Streszczenie:
International Telemetering Conference Proceedings / October 26-29, 1998 / Town & Country Resort Hotel and Convention Center, San Diego, California
This paper describes a distributed network client-server system developed for researchers to perform, in real time or near real time, analyses on ensembles of telemetry data that were previously performed post-flight. The client-server software approach provides extensible computing and real-time access to data at multiple remote client sites. Researchers at remote sites can share the same information as researchers at the test site. The system has been used successfully in numerous commercial, academic, and NASA-wide aircraft flight test programs.
Style APA, Harvard, Vancouver, ISO itp.
40

Karlsson, Anders. "Presentation of Real-Time TFR-data". Thesis, Högskolan i Gävle, Avdelningen för Industriell utveckling, IT och Samhällsbyggnad, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:hig:diva-17228.

Pełny tekst źródła
Streszczenie:
In a high-voltage direct current (HVDC) system, a process is continuously recording data (e.g., voltage sampling). To access this data, the operator must first set triggers, then wait for the transient fault recording (TFR) to complete, and finally open the recorded file for analysis. A more refined solution is required. It should allow the operator to select, add and watch the measuring points in real time. Thus, the purpose of this thesis work is to read, process and present the samples continuously, as series in a graphical chart. Programming was done using iterative and incremental development. The result of this thesis work is an executable application with a graphical user interface (GUI), able to show the content of the buffers as a graph in a chart.
Style APA, Harvard, Vancouver, ISO itp.
41

Achtzehnter, Joachim, i Preston Hauck. "REAL-TIME TENA-ENABLED DATA GATEWAY". International Foundation for Telemetering, 2004. http://hdl.handle.net/10150/605318.

Pełny tekst źródła
Streszczenie:
International Telemetering Conference Proceedings / October 18-21, 2004 / Town & Country Resort, San Diego, California
This paper describes the TENA architecture, which has been proposed by the Foundation Initiative 2010 (FI 2010) project as the basis for future US Test Range software systems. The benefits of this new architecture are explained by comparing the future TENA-enabled range infrastructure with the current situation of largely non-interoperable range resources. Legacy equipment and newly acquired off-the-shelf equipment that does not directly support TENA can be integrated into a TENA environment using TENA Gateways. This paper focuses on issues related to the construction of such gateways, including the important issue of real-time requirements when dealing with real-world data acquisition instruments. The benefits of leveraging commercial off-the-shelf (COTS) Data Acquisition Systems that are based on true real-time operating systems are discussed in the context of TENA Gateway construction.
Style APA, Harvard, Vancouver, ISO itp.
42

White, Allan P., i Richard K. Dean. "Real-Time Test Data Processing System". International Foundation for Telemetering, 1989. http://hdl.handle.net/10150/614650.

Pełny tekst źródła
Streszczenie:
International Telemetering Conference Proceedings / October 30-November 02, 1989 / Town & Country Hotel & Convention Center, San Diego, California
The U.S. Army Aviation Development Test Activity at Fort Rucker, Alabama needed a real-time test data collection and processing capability for helicopter flight testing. The system had to be capable of collecting and processing both FM and PCM data streams from analog tape and/or a telemetry receiver. The hardware and software were to be off-the-shelf whenever possible. The integration was to result in a stand-alone telemetry collection and processing system.
Style APA, Harvard, Vancouver, ISO itp.
43

Toufie, Moegamat Zahir. "Real-time loss-less data compression". Thesis, Cape Technikon, 2000. http://hdl.handle.net/20.500.11838/1367.

Pełny tekst źródła
Streszczenie:
Thesis (MTech (Information Technology))--Cape Technikon, Cape Town, 2000
Data stored on disks generally contains significant redundancy. A mechanism or algorithm that recodes the data to lessen its size could possibly double or triple the effective amount of data that can be stored on the media. One mechanism for doing this is data compression. Many compression algorithms currently exist, but each one has its own advantages as well as disadvantages. The objective of this study is to formulate a new compression algorithm that could be implemented in a real-time mode in any file system. The new compression algorithm should also execute as fast as possible, so as not to cause a lag in the file system's performance. This study focuses on binary data of any type, whereas previous articles such as (Huffman, 1952:1098), (Ziv & Lempel, 1977:337; 1978:530), (Storer & Szymanski, 1982:928) and (Welch, 1984:8) have placed particular emphasis on text compression in their discussions of compression algorithms for computer data. The compression algorithm formulated by this study is Lempel-Ziv-Toufie (LZT). LZT is basically an LZ77 (Ziv & Lempel, 1977:337) encoder with a buffer equal in size to the data block of the file system in question. Unlike LZ77, which distinguishes between a search buffer and a look-ahead buffer, LZT does not make this distinction: it discards the sliding-buffer principle and uses each data block of the entire input stream as one big buffer on which compression is performed. LZT also handles the encoding of a match slightly differently from LZ77. An LZT match is encoded by two bit streams, the first specifying the position of the match and the other specifying the length of the match. This combination is commonly referred to as a pair. To encode the position portion of the pair, we make use of a sliding-scale method, which works as follows. Let the position in the input buffer of the current character to be compressed be held by inpos, where inpos is initially set to 3. It is then only possible for a match to occur at position 1 or 2. Hence the position of a match will never be greater than 2, and therefore the position portion can be encoded using only 1 bit. As inpos is incremented as each character is encoded, the match position range increases and therefore more bits will be required to encode the match position. The reason why a decimal 2 can be encoded using only 1 bit can be explained as follows. When decimal values are converted to binary values, we get 0 (decimal) = 0 (binary), 1 (decimal) = 1 (binary), 2 (decimal) = 10 (binary), and so on. As a position of 0 will never be used, it is possible to develop a coding scheme where a decimal value of 1 is represented by a binary value of 0, and a decimal value of 2 is represented by a binary value of 1. Only 1 bit is therefore needed to encode match position 1 and match position 2. In general, any decimal value n can be represented by the binary equivalent of (n - 1). The number of bits needed to encode (n - 1) indicates the number of bits needed to encode the match position. The length portion of the pair is encoded using a variable-length coding (vlc) approach. The vlc method performs its encoding using binary blocks. The first binary block is 3 bits long, where binary values 000 through 110 represent decimal values 1 through 7.
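The sliding-scale position coding described above can be made concrete in a few lines of Python. This sketch only covers the position portion of the pair as explained in the abstract (encode position p as the binary value of p - 1, using exactly as many bits as the largest possible position needs); it is an illustration, not the LZT implementation.

```python
def position_bits(inpos: int) -> int:
    """Bits needed for a match position when encoding the character at inpos.
    The largest possible position is inpos - 1, stored as (inpos - 1) - 1 = inpos - 2."""
    return max(1, (inpos - 2).bit_length())

def encode_position(pos: int, inpos: int) -> str:
    """Encode match position pos (1 <= pos < inpos) as the binary value of pos - 1."""
    assert 1 <= pos < inpos
    return format(pos - 1, f"0{position_bits(inpos)}b")

def decode_position(bits: str) -> int:
    return int(bits, 2) + 1

# At inpos = 3 only positions 1 and 2 exist, so a single bit suffices;
# as inpos grows, the number of position bits grows with it.
for inpos, pos in [(3, 1), (3, 2), (5, 4), (9, 7), (100, 63)]:
    code = encode_position(pos, inpos)
    print(f"inpos={inpos:3d} pos={pos:3d} -> {code} ({position_bits(inpos)} bits)")
    assert decode_position(code) == pos
```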
Style APA, Harvard, Vancouver, ISO itp.
44

Jelecevic, Edin, i Thong Nguyen Minh. "VISUALIZE REAL-TIME DATA USING AUTOSAR". Thesis, Örebro universitet, Institutionen för naturvetenskap och teknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-76618.

Pełny tekst źródła
Streszczenie:
Today there are more cars on the road than ever before; the automotive industry is continually expanding and adapting to meet the need for new technologies. To improve complexity management and reduce engineering time and cost, the Automotive Open System Architecture, better known as AUTOSAR, has been introduced with the aim of standardizing Electronic Control Units (ECUs). Today the AUTOSAR standard is used in the automotive industry; the purpose of this thesis is to investigate whether the standard can be used for something that has no direct connection to the automotive industry. The report gives background information about AUTOSAR, and the project uses AUTOSAR to visualize real-time data from the web on an LED sheet. In this project a physical visualization board has been created and the code was written within the integrated software development environment Arctic Studio and its tools; the visualization board will be used at the ARCCORE office in Linköping.
Idag finns det fler bilar på vägarna än någonsin tidigare, fordonsindustrin expanderar ständigt och anpassar sig efter behovet av ny teknik. För att förbättra komplexitets hanteringen och minska tillverkningstider och kostnader så har Automotive Open System Architecture, närmare känt som AUTOSAR, införts som har målet att standardisera elektroniska styrenheter (ECUn). Idag används AUTOSAR-standardiseringen för fordonsindustrin, vad projektet kommer att utforska är att se om man kan använda standarden för något som inte har någon direkt koppling till fordonsindustrin. Rapporten ger en förklaring om AUTOSAR, i projektet så används AUTOSAR för att visualisera realtidsdata från webben på en LED-karta. I detta projekt har en fysisk visualiseringstavla skapats där kod skrevs inom den integrerade mjukvaruutvecklingsmiljön Arctic Studio, visualiseringstavlan kommer att användas på ARCCOREs egna kontor i Linköping.
Style APA, Harvard, Vancouver, ISO itp.
45

Ayyad, Majed. "Real-Time Event Centric Data Integration". Doctoral thesis, Università degli studi di Trento, 2014. https://hdl.handle.net/11572/367750.

Pełny tekst źródła
Streszczenie:
A vital step in integrating data from multiple sources is detecting and handling duplicate records that refer to the same real-life entity. Events are spatio-temporal entities that reflect changes in the real world and are received or captured from different sources (sensors, mobile phones, social network services, etc.). In many real-world situations, detecting events mostly takes place through multiple observations by different observers. The local view of an observer reflects only partial knowledge with a certain granularity of time and space. Observations occur at a particular place and time, whereas events, which are inferred from observations, range over time and space. In this thesis, we address the problem of event matching, which is the task of detecting similar events in the recent past from their observations. We focus on detecting hyperlocal events, which are an integral part of any dynamic human decision-making process and are useful for different multi-tier responding agencies such as emergency medical services, public safety and law enforcement agencies, and organizations fusing news from different sources, as well as for citizens. In an environment where continuous monitoring and processing is required, the matching task imposes different challenges. In particular, the matching task is decomposed into four separate tasks, each requiring a different computational method. The four tasks are: event-type similarity, similarity in location, similarity in time, and thematic-role similarity that handles participant similarity. We refer to these four tasks as local similarities. A global similarity measure then combines the four local similarities before the events can be clustered and handled in a robust near real-time system. We address local similarity by thoroughly studying existing similarity measures and proposing a suitable measure for each task. We utilize ideas from the semantic web, qualitative spatial reasoning, fuzzy sets, and structural alignment similarities in order to define the local similarity measures. We then address global similarity by treating the problem as a relational learning problem and use machine learning to learn the weights of each local similarity. To learn the weights, we combine the features of each pair of events into one object and use logistic regression and support vector machines. The learned weighting function is tested and evaluated on a real dataset and used to predict the similarity class of newly streamed events.
Style APA, Harvard, Vancouver, ISO itp.
46

Eriksson, Ruth, i Miranda Luis Galaz. "Ett digitalt läromedel för barn med lässvårigheter". Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-189205.

Pełny tekst źródła
Streszczenie:
Den digitala tidsåldern förändrar samhället. Ny teknik ger möjligheter att framställa och organisera kunskap på nya sätt. Tekniken som finns i skolan i dag, kan även utnyttjas till att optimera lästräningen till elever med lässvårigheter. Denna avhandling undersöker hur ett digitalt läromedel för läsinlärning för barn med lässvårigheter kan designas och implementeras, och visar att detta är möjligt att genomföra. Ett digitalt läromedel av bra kvalitet måste utgå ifrån en vetenskapligt vedertagen läsinlärningsmetod. Denna avhandling utgår ifrån Gunnel Wendicks modell, som redan används av många specialpedagoger. Modellen används dock i sin ursprungsform, med papperslistor med ord, utan datorer, surfplattor eller liknande. Vi analyserar Wendick-modellen, och tillämpar den på ett kreativt sätt för att designa en digital motsvarighet till det ursprungliga arbetssättet. Vårt mål är att skapa ett digitalt läromedel som implementerar Wendick-modellen, och på så sätt göra det möjligt att modellen används på olika smarta enheter. Med detta hoppas vi kunna underlätta arbetet både för specialpedagoger och barn med lässvårigheter, samt göra rutinerna mer tilltalande och kreativa. I vår studie undersöker vi olika tekniska möjligheter för att implementera Wendick-modellen. Vi väljer att skapa en prototyp av en webbapplikation, med passande funktionalitet för både administratörer, specialpedagoger och elever. Prototypens funktionalitet kan delas upp i två delar, den administrativa delen och övningsdelen. Den administrativa delen omfattar användargränssnitt och funktionalitet för hantering av elever och andra relevanta uppgifter. Övningsdelen omfattar övningsvyer och deras funktionalitet. Övningarnas funktionalitet är tänkt för att träna den auditiva kanalen, den fonologiska avkodningen - med målet att läsa rätt, samt den ortografiska avkodningen - med målet att eleven ska automatisera sin avkodning, d.v.s. att uppfatta orden som en bild. I utvecklandet av det digitala läromedlet används beprövade principer inom mjukvaruteknik och beprövade implementationstekniker. Man sammanställer högnivåkrav, modellerar domänen och definierar passande användningsfall. För att implementera applikationen används Java EE plattform, Web speech API, Primefaces specifikationer, och annat. Vår prototyp är en bra början som inspirerar till vidare utveckling, med förhoppning om att en fullständig webapplikation ska skapas, som ska förändra arbetssättet i våra skolor.
The digital age is changing society. New technology provides opportunities to produce and organize knowledge in new ways. The technology available in schools today can also be used to optimize literacy training for students with reading difficulties. This thesis examines how a digital teaching material for literacy training for children with reading difficulties can be designed and implemented, and shows that this is possible to achieve. A digital learning material of good quality should be based on a scientifically accepted method of literacy training. This thesis uses Gunnel Wendick's training model, which is already used by many special education teachers. The training model is used with word lists, without computers, tablets or the like. We analyze Wendick's training model and employ it, in a creative way, to design a digital equivalent to the original model. Our goal is to create a digital learning material that implements Wendick's training model, and thus make it possible to use the model on various smart devices. With this we hope to facilitate the work of both the special education teachers and children with reading difficulties, and to make the procedures more appealing and creative. In our study, we examine various technical possibilities for implementing Wendick's training model. We choose to create a prototype of a web application, with suitable functionality for administrators, special education teachers, and students. The prototype's functionality can be divided into two parts: the administrative part and the exercise part. The administrative part covers the user interface and functionality for handling students and other relevant data. The exercise part includes training views and their functionality. The functionality of the exercises is intended to train the auditory channel, the phonological awareness - with the goal of reading accurately - and the orthographic decoding - with the goal that students should automate their decoding, that is, perceive the words as an image. In the development of the digital teaching material, we used proven principles of software engineering and proven implementation techniques: we compiled high-level requirements, modeled the domain, and defined appropriate use cases. To implement the application, we used the Java EE platform, the Web Speech API, the PrimeFaces specifications, and more. Our prototype is a good start to inspire further development, with the hope that a full web application will be created that will transform the practices in our schools.
Style APA, Harvard, Vancouver, ISO itp.
47

Bennion, Laird. "Identifying data center supply and demand". Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/103457.

Pełny tekst źródła
Streszczenie:
Thesis: S.M. in Real Estate Development, Massachusetts Institute of Technology, Program in Real Estate Development in conjunction with the Center for Real Estate, 2016.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 66-69).
This thesis documents new methods for gauging supply and demand of data center capacity and addresses issues surrounding potential threats to data center demand. The document opens with a primer on the composition and engineering of a current data center, continues with a discussion of issues surrounding data center demand, Moore's Law, and cloud computing, and then transitions to a presentation of research on data center demand and supply.
by Laird Bennion.
S.M. in Real Estate Development
Style APA, Harvard, Vancouver, ISO itp.
48

Tidball, John E. "REAL-TIME HIGH SPEED DATA COLLECTION SYSTEM WITH ADVANCED DATA LINKS". International Foundation for Telemetering, 1997. http://hdl.handle.net/10150/609754.

Pełny tekst źródła
Streszczenie:
International Telemetering Conference Proceedings / October 27-30, 1997 / Riviera Hotel and Convention Center, Las Vegas, Nevada
The purpose of this paper is to describe the development of a very high-speed instrumentation and digital data recording system. The system converts multiple asynchronous analog signals to digital data, forms the data into packets, transmits the packets across fiber-optic lines and routes the data packets to destinations such as high-speed recorders, hard disks, Ethernet, and data processing. This system is capable of collecting approximately one hundred megabytes per second of filtered packetized data. The significant system features are its design methodology, system configuration, decoupled interfaces, data as packets, the use of RACEway data and VME control buses, distributed processing on mixed-vendor PowerPCs, real-time resource management objects, and an extendible and flexible configuration.
Style APA, Harvard, Vancouver, ISO itp.
49

Cai, Simin. "Systematic Design of Data Management for Real-Time Data-Intensive Applications". Licentiate thesis, Mälardalens högskola, Inbyggda system, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-35369.

Pełny tekst źródła
Streszczenie:
Modern real-time data-intensive systems generate large amounts of data that are processed using complex data-related computations such as data aggregation. In order to maintain the consistency of data, such computations must be both logically correct (producing correct and consistent results) and temporally correct (completing before specified deadlines). One solution to ensure logical and temporal correctness is to model these computations as transactions and manage them using a Real-Time Database Management System (RTDBMS). Ideally, depending on the particular system, the transactions are customized with the desired logical and temporal correctness properties, which are achieved by the customized RTDBMS with appropriate run-time mechanisms. However, developing such a data management solution with provided guarantees is not easy, partly due to inadequate support for systematic analysis during the design. Firstly, designers do not have means to identify the characteristics of the computations, especially data aggregation, and to reason about their implications. Design flaws might not be discovered, and thus they may be propagated to the implementation. Secondly, trade-off analysis of conflicting properties, such as conflicts between transaction isolation and temporal correctness, is mainly performed ad-hoc, which increases the risk of unpredictable behavior. In this thesis, we propose a systematic approach to develop transaction-based data management with data aggregation support for real-time systems. Our approach includes the following contributions: (i) a taxonomy of data aggregation, (ii) a process for customizing transaction models and RTDBMS, and (iii) a pattern-based method of modeling transactions in the timed automata framework, which we show how to verify with respect to transaction isolation and temporal correctness. Our proposed taxonomy of data aggregation processes helps in identifying their common and variable characteristics, based on which their implications can be reasoned about. Our proposed process allows designers to derive transaction models with desired properties for the data-related computations from system requirements, and decide the appropriate run-time mechanisms for the customized RTDBMS to achieve the desired properties. To perform systematic trade-off analysis between transaction isolation and temporal correctness specifically, we propose a method to create formal models of transactions with concurrency control, based on which the isolation and temporal correctness properties can be verified by model checking, using the UPPAAL tool. By applying the proposed approach to the development of an industrial demonstrator, we validate the applicability of our approach.
DAGGERS
Style APA, Harvard, Vancouver, ISO itp.
50

Park, Sun Jung Park S. M. Massachusetts Institute of Technology. "Data science strategies for real estate development". Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/129099.

Pełny tekst źródła
Streszczenie:
Thesis: S.M. in Real Estate Development, Massachusetts Institute of Technology, Program in Real Estate Development in conjunction with the Center for Real Estate, September, 2020
Cataloged from student-submitted PDF of thesis.
Includes bibliographical references (pages 43-45).
Big data and the increasing usage of data science are changing the way the real estate industry functions. From pricing estimates and valuation to marketing and leasing, the power of predictive analytics is improving business processes and presenting new ways of operating. The field of affordable housing development, however, has often lacked investment and seen delays in adopting new technology and data science. With the growing need for housing, every city needs combined efforts from both public and private sectors, as well as a stronger knowledge base of the demands and experiences of people needing these spaces. Data science can provide insights into the needs for affordable housing and enhance efficiencies in development to help get those homes built, leased, or even sold in a new way. This research provides a tool-kit for modern-day real estate professionals in identifying appropriate data to make better-informed decisions in the real estate development process. From public city data to privately gathered data, there is a vast amount of information and numerous sources available in the industry. This research aims to compile a database of data sources, analyze the development process to understand the key metrics that stakeholders need in order to make decisions, and map those sources to each phase or question that must be answered to make an optimal development decision. This research reviews data science from the developer's perspective and provides a direction that developers can use to orient themselves during the initial phase and incorporate a data-driven strategy into their affordable multi-family housing projects.
by Sun Jung Park.
S.M. in Real Estate Development
S.M. in Real Estate Development, Massachusetts Institute of Technology, Program in Real Estate Development in conjunction with the Center for Real Estate
Style APA, Harvard, Vancouver, ISO itp.