Dissertations / Theses on the topic 'Data Format'

Consult the top 50 dissertations / theses for your research on the topic 'Data Format.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Mills, H. L., and K. D. Turver. "24-BIT FLIGHT TEST DATA RECORDING FORMAT." International Foundation for Telemetering, 1991. http://hdl.handle.net/10150/612937.

Abstract:
International Telemetering Conference Proceedings / November 04-07, 1991 / Riviera Hotel and Convention Center, Las Vegas, Nevada
Boeing Commercial Airplane Group’s Flight Test Engineering organization is developing a new test data recording format to be used on the new model 777 airplane. ARINC 429, ARINC 629 and IRIG PCM data will be formatted for recording test data. The need to support a variety of data recorders, and three types of data, mandates the development of a new recording format. The format Flight Test chose is a variation of IRIG Standard 106-86, Chapter 8. The data from each channel is treated as a data packet, including time and channel ID, and then multiplexed into 24 bits. This allows a time accuracy of 10 microseconds and minimal latency caused by multiplexing.
2

Meyer, David, Friedrich Leisch, Torsten Hothorn, and Kurt Hornik. "StatDataML. An XML format for statistical data." SFB Adaptive Information Systems and Modelling in Economics and Management Science, WU Vienna University of Economics and Business, 2002. http://epub.wu.ac.at/540/1/document.pdf.

Abstract:
In order to circumvent common difficulties in exchanging statistical data between heterogeneous applications (format incompatibilities, technocentric data representation), we introduce an XML-based markup language for statistical data, called StatDataML. After comparing StatDataML to other data concepts, we detail the design which borrows from the language S, such that data objects are basically organized as recursive and non-recursive structures, and may also be supplemented with meta-information.
Series: Report Series SFB "Adaptive Information Systems and Modelling in Economics and Management Science"
3

Ilg, Markus. "Digital processing of map data in raster format /." Zürich : Geographisches Institut Eidgenössische Technische Hochschule, 1986. http://e-collection.ethbib.ethz.ch/show?type=diss&nr=7973.

4

Kupferschmidt, Benjamin, and Eric Pesciotta. "Automatic Format Generation Techniques for Network Data Acquisition Systems." International Foundation for Telemetering, 2009. http://hdl.handle.net/10150/606089.

Abstract:
ITC/USA 2009 Conference Proceedings / The Forty-Fifth Annual International Telemetering Conference and Technical Exhibition / October 26-29, 2009 / Riviera Hotel & Convention Center, Las Vegas, Nevada
Configuring a modern, high-performance data acquisition system is typically a very time-consuming and complex process. Any enhancement to the data acquisition setup software that can reduce the amount of time needed to configure the system is extremely useful. Automatic format generation is one of the most useful enhancements to a data acquisition setup application. By using Automatic Format Generation, an instrumentation engineer can significantly reduce the amount of time that is spent configuring the system while simultaneously gaining much greater flexibility in creating sampling formats. This paper discusses several techniques that can be used to generate sampling formats automatically while making highly efficient use of the system's bandwidth. This allows the user to obtain most of the benefits of a hand-tuned, manually created format without spending excessive time creating it. One of the primary techniques that this paper discusses is an enhancement to the commonly used power-of-two rule for selecting sampling rates. This allows the system to create formats that use a wider variety of rates. The system is also able to handle groups of related measurements that must follow each other sequentially in the sampling format. This paper will also cover a packet-based formatting scheme that organizes measurements based on common sampling rates. Each packet contains a set of measurements that are sampled at a particular rate. A key benefit of using an automatic format generation system with this format is the optimization of sampling rates that are used to achieve the best possible match for each measurement's desired sampling rate.
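As a rough sketch of the ideas in this abstract, the basic power-of-two rate rule and the grouping of measurements into rate-based packets might look like the snippet below. The measurement names, rates, and packet organization are invented for illustration and are not taken from the paper, which enhances the plain power-of-two rule to allow a wider variety of rates.

```python
import math
from collections import defaultdict

def round_rate_power_of_two(desired_hz):
    """Basic power-of-two rule: round a desired sampling rate up to the next power of two."""
    return 2 ** math.ceil(math.log2(desired_hz))

def build_packets(measurements):
    """Group measurements by their assigned rate; each rate group becomes one packet type."""
    groups = defaultdict(list)
    for name, desired_hz in measurements:
        groups[round_rate_power_of_two(desired_hz)].append(name)
    return dict(groups)

# Hypothetical measurement list: (name, desired sampling rate in Hz).
measurements = [("engine_temp", 10), ("strain_gauge_1", 700), ("gps_altitude", 3)]
for rate, names in sorted(build_packets(measurements).items()):
    print(f"{rate:5d} Hz packet: {names}")
```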
5

Peart, David E., and Jim Talbert. "CONVERTING ASYNCHRONOUS DATA INTO A STANDARD IRIG TELEMETRY FORMAT." International Foundation for Telemetering, 1997. http://hdl.handle.net/10150/609679.

Abstract:
International Telemetering Conference Proceedings / October 27-30, 1997 / Riviera Hotel and Convention Center, Las Vegas, Nevada
In recent years we have seen an increase in the use of MIL-STD-1553 buses and other asynchronous data sources in new missile and launcher designs. The application of multiplexed asynchronous buses in missiles and launchers is very common today. With the increasing application of asynchronous data sources in very complex systems, the need to acquire, analyze, and present one hundred percent of the bus traffic in real time or near real time has become especially important during testing and diagnostic operations. This paper discusses ways of converting asynchronous data, including MIL-STD-1553, into a telemetry format that is suitable for encryption, telemetering, recording, and presenting with Inter Range Instrumentation Group (IRIG) compatible off-the-shelf hardware. The importance of these designs lies in providing the capability to conserve data bandwidth and to maximize the use of existing hardware. In addition, this paper will discuss a unique decode and time tagging design that conserves data storage when compared to the methods in IRIG Standard 106-96 and still maintains a very accurate time tag.
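The encapsulation idea can be sketched as follows. This is a minimal illustration only, not the authors' design: the header layout (a 64-bit microsecond time tag plus a 16-bit length) and the frame size are assumptions made for the example.

```python
import struct
import time

def encapsulate(message: bytes, timestamp_us: int) -> bytes:
    """Prepend a hypothetical header: 64-bit time tag in microseconds plus a 16-bit length."""
    return struct.pack(">QH", timestamp_us, len(message)) + message

def fill_minor_frame(pending, frame_size=64, fill=b"\x00"):
    """Pack as many encapsulated messages as fit into a fixed-size frame, padding the rest."""
    frame = bytearray()
    while pending and len(frame) + len(pending[0]) <= frame_size:
        frame += pending.pop(0)
    frame += fill * (frame_size - len(frame))
    return bytes(frame)

raw_1553_word = bytes.fromhex("1553abcd")            # stand-in for a captured bus message
packet = encapsulate(raw_1553_word, int(time.time() * 1e6))
print(fill_minor_frame([packet]).hex())
```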
6

Graul, Michael, Ronald Fernandes, John L. Hamilton, Charles H. Jones, and Jon Morgan. "ENHANCEMENTS TO THE DATA DISPLAY MARKUP LANGUAGE." International Foundation for Telemetering, 2006. http://hdl.handle.net/10150/604103.

Abstract:
ITC/USA 2006 Conference Proceedings / The Forty-Second Annual International Telemetering Conference and Technical Exhibition / October 23-26, 2006 / Town and Country Resort & Convention Center, San Diego, California
This paper presents the description of the updated Data Display Markup Language (DDML), a neutral format for data display configurations. The development of DDML is motivated by the fact that in joint service program systems, there is a critical need for common data displays to support distributed T&E missions, irrespective of the test location, data acquisition system, and display system. DDML enables standard data displays to be specified for any given system under test, irrespective of the display vendor or system in which they will be implemented. Version 3.0 of DDML represents a more mature language than version 1.0, which was presented at the 2003 ITC. The updated version has been validated for completeness and robustness by developing translators between DDML and numerous vendor formats. The DDML schema has been presented to the Range Commanders Council (RCC) Data Multiplex Committee for consideration for inclusion in the IRIG 106 standard. The DDML model will be described in terms of both the XML schema and the UML model, and various examples of DDML models will be presented. The intent of this paper is to solicit specific input from the community on this potential RCC standard.
7

Wegener, John A., and Rodney L. Davis. "EXTENSION OF A COMMON DATA FORMAT FOR REAL-TIME APPLICATIONS." International Foundation for Telemetering, 2004. http://hdl.handle.net/10150/604961.

Abstract:
International Telemetering Conference Proceedings / October 18-21, 2004 / Town & Country Resort, San Diego, California
The HDF5 (Hierarchical Data Format) data storage family is an industry-standard format that allows data to be stored in a common format and retrieved by a wide range of common tools. HDF5 is a widely accepted industry-standard container for data storage developed by the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign. The HDF5 data storage family includes HDF-Time History, intended for data processing, and HDF-Packet, intended for real-time data collection; each of these is an extension to the basic HDF5 format, which defines data structures and associated interrelationships, optimized for that particular purpose. HDF-Time History, developed jointly by Boeing and NCSA, is in the process of being adopted throughout the Boeing test community and by its external partners. The Boeing/NCSA team is currently developing HDF-Packet to support real-time streaming applications, such as airborne data collection and recording of received telemetry. The advantages are significant cost reduction resulting from storing the data in its final format, thus avoiding conversion between a myriad of recording and intermediate formats. In addition, by eliminating intermediate file translations and conversions, data integrity is maintained from recording through processing and archival storage. Furthermore, HDF5 is a general-purpose wrapper into which processed data and other documentation (such as calibrations) can be stored, thus making the final data file self-documenting. This paper describes the basics of HDF-Time History, the extensions required to support real-time acquisition with HDF-Packet, and implementation issues unique to real-time acquisition. It also describes potential future implementations for data acquisition systems in different segments of the test data industry.
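The abstract does not spell out HDF-Packet's internal layout, so the following is only a generic h5py sketch of the underlying idea: time-tagged packets appended to an extendible HDF5 dataset, with calibration-style metadata stored as attributes so the file stays self-documenting. The group, dataset, and attribute names are invented for the example and are not the HDF-Packet specification.

```python
import numpy as np
import h5py

# Hypothetical packet record: a time tag plus a fixed-size payload.
packet_dtype = np.dtype([("time_us", "<u8"), ("payload", "u1", (64,))])

with h5py.File("flight_test.h5", "w") as f:
    grp = f.create_group("channel_01")
    grp.attrs["description"] = "example ARINC 429 channel"   # self-documenting metadata
    grp.attrs["calibration_slope"] = 0.125
    packets = grp.create_dataset("packets", shape=(0,), maxshape=(None,),
                                 dtype=packet_dtype, chunks=(1024,))
    # Append packets as they arrive (two dummy packets here).
    new = np.zeros(2, dtype=packet_dtype)
    new["time_us"] = [1_000_000, 1_000_100]
    packets.resize(packets.shape[0] + len(new), axis=0)
    packets[-len(new):] = new
```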
8

Alfredsson, Anders. "XML as a Format for Representation and Manipulation of Data from Radar Communications." Thesis, University of Skövde, Department of Computer Science, 2001. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-591.

Abstract:

XML was designed to be a new standard for marking up data on the web. However, as a result of its extensible and flexible properties, XML is now being used more and more for purposes other than originally intended. Today XML is prompting an approach more focused on data exchange between different applications inside companies or even between cooperating businesses.

Businesses are showing interest in using XML as an integral part of their work. Ericsson Microwave Systems (EMW) is a company that sees XML as a conceivable solution to problems in its work with radar communications. An approach towards a solution based on a relational database system had been analysed earlier.

In this project we present an investigation of the work at EMW, and an identification and documentation of the problems in the radar communication work. Also, the requirements and expectations that EMW has of XML are presented. Moreover, an analysis has been made to decide to what extent XML could be used to solve the problems of EMW. The analysis was conducted by elucidating the problems and possibilities of XML compared to the previous approach for solving the problems at EMW, which was based on using a relational database management system.

The analysis shows that XML has good features for representing hierarchically structured data, as in the EMW case. It is also shown that XML is good for data integration purposes. Furthermore, the analysis shows that XML, due to its self-describing and weakly typed nature, is inappropriate to use in the data semantics and integrity problem context of EMW. However, it also shows that the new XML Schema standard could be used as a complement to the core XML standard to partially solve the semantics problems.

9

Barnum, Jil. "THE USE OF HDF IN F-22 AVIONICS TEST AND EVALUATION." International Foundation for Telemetering, 1996. http://hdl.handle.net/10150/608388.

Abstract:
International Telemetering Conference Proceedings / October 28-31, 1996 / Town and Country Hotel and Convention Center, San Diego, California
Hierarchical Data Format (HDF) is a public domain standard for file formats which is documented and maintained by the National Center for Supercomputing Applications. HDF is the standard adopted by the F-22 program to increase the efficiency of avionics data processing and the utility of the data. This paper will discuss how the data processing Integrated Product Team (IPT) on the F-22 program plans to use HDF for file format standardization. The history of the IPT's choice of HDF, the efficiencies gained by choosing HDF, and the ease of data transfer will be explained.
10

Wan, Wade K. (Wade Keith) 1973. "Adaptive format conversion information as enhancement data for scalable video coding." Thesis, Massachusetts Institute of Technology, 2002. http://hdl.handle.net/1721.1/29903.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002.
Includes bibliographical references (p. 143-145).
Scalable coding techniques can be used to efficiently provide multicast video service and involve transmitting a single independently coded base layer and one or more dependently coded enhancement layers. Clients can decode the base layer bitstream and none, some or all of the enhancement layer bitstreams to obtain video quality commensurate with their available resources. In many scalable coding algorithms, residual coding information is the only type of data that is coded in the enhancement layers. However, since the transmitter has access to the original sequence, it can adaptively select different format conversion methods for different regions in an intelligent manner. This adaptive format conversion information can then be transmitted as enhancement data to assist processing at the decoder. The use of adaptive format conversion has not been studied in detail and this thesis examines when and how it can be used for scalable video compression. A new scalable codec is developed in this thesis that can utilize adaptive format conversion information and/or residual coding information as enhancement data. This codec was used in various simulations to investigate different aspects of adaptive format conversion such as the effect of the base layer, a comparison of adaptive format conversion and residual coding, and the use of both adaptive format conversion and residual coding.
The experimental results show adaptive format conversion can provide video scalability at low enhancement bitrates not possible with residual coding and also assist residual coding at higher enhancement layer bitrates. This thesis also discusses the application of adaptive format conversion to the migration path for digital television. Adaptive format conversion is well-suited to the unique problems of the migration path and can provide initial video scalability as well as assist a future migration path.
by Wade K. Wan.
Ph.D.
11

Kupferschmidt, Benjamin, and Albert Berdugo. "DESIGNING AN AUTOMATIC FORMAT GENERATOR FOR A NETWORK DATA ACQUISITION SYSTEM." International Foundation for Telemetering, 2006. http://hdl.handle.net/10150/604157.

Abstract:
ITC/USA 2006 Conference Proceedings / The Forty-Second Annual International Telemetering Conference and Technical Exhibition / October 23-26, 2006 / Town and Country Resort & Convention Center, San Diego, California
In most current PCM-based telemetry systems, an instrumentation engineer manually creates the sampling format. This time-consuming and tedious process typically involves manually placing each measurement into the format at the proper sampling rate. The telemetry industry is now moving towards Ethernet-based systems comprising multiple autonomous data acquisition units, which share a single global time source. The architecture of these network systems greatly simplifies the task of implementing an automatic format generator. Automatic format generation eliminates much of the effort required to create a sampling format because the instrumentation engineer only has to specify the desired sampling rate for each measurement. The system handles the task of organizing the format to comply with the specified sampling rates. This paper examines the issues involved in designing an automatic format generator for a network data acquisition system.
12

Rajyalakshmi, P. S., and R. K. Rajangam. "Data Handling System for IRS." International Foundation for Telemetering, 1987. http://hdl.handle.net/10150/615329.

Abstract:
International Telemetering Conference Proceedings / October 26-29, 1987 / Town and Country Hotel, San Diego, California
The three-axis stabilized Indian Remote Sensing Satellite will image the earth from a 904 km polar, sun-synchronous orbit. The payload is a set of CCD cameras which collect data in four bands in the visible and near-infrared region. This payload data from two cameras, each at 10.4 megabits per second, is transmitted as balanced QPSK in X band. The payload data before transmission is formatted by adopting Major and Minor frame synchronizing codes. The two formatted data streams are differentially encoded to take care of the 4-phase ambiguity due to QPSK transmission. This paper describes the design and development aspects related to such a Data Handling System. It also highlights the environmental qualification tests that were carried out to meet the requirement of a three-year operational life of the satellite.
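Differential encoding resolves the four-fold phase ambiguity of QPSK because the information rides on symbol-to-symbol phase transitions rather than on absolute phase. The sketch below shows one common differential scheme as an illustration; it is not claimed to be the exact scheme used on IRS.

```python
def diff_encode(symbols, reference=0):
    """Differentially encode 2-bit QPSK symbols: each output is the running sum modulo 4."""
    out, acc = [], reference
    for s in symbols:
        acc = (acc + s) % 4
        out.append(acc)
    return out

def diff_decode(received, reference=0):
    """Recover data from phase differences; a constant phase offset cancels out."""
    out, prev = [], reference
    for r in received:
        out.append((r - prev) % 4)
        prev = r
    return out

data = [1, 3, 0, 2, 2]
tx = diff_encode(data)
for k in range(4):                          # any of the four possible carrier-phase ambiguities
    rx = [(s + k) % 4 for s in tx]          # receiver sees every symbol rotated by k quadrants
    assert diff_decode(rx, reference=k) == data   # the reference is rotated the same way, so it cancels
print("data recovered under all four phase ambiguities")
```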
13

Abboud, Fayez. "Utilizing Image-based Formats to Optimize Pattern Data Format and Processing In Mask and Maskless Pattern Generation Lithography." NSUWorks, 2012. http://nsuworks.nova.edu/gscis_etd/73.

Abstract:
According to Moore's law, the IC (Integrated Circuit) minimum feature size is to shrink node over node, resulting in denser compaction of the design. Such compaction results in more polygons per design. The extension of optical lithography to print features at a fraction of the wavelength is only possible with the use of optical tricks, like RET (Resolution Enhancement Techniques) and ILT (Inverse Lithography Technology), to account for systematic corrections needed between the mask and the wafer exposure. Such optical tricks add extensive decorations and edge jogs to the primary features, creating even larger increases in the number of polygons per design. As the pattern file size increases, processing time and complexity become directly proportional to the number of polygons; this increase is now becoming one of the key obstacles in the data processing flow. Polygon-based, or vector-based, pattern file formats have been extended for the past forty years, and their applicability to modern designs and trends is now in question. The current polygon-based data flow for IC pattern processing is cumbersome, inefficient, and prone to rounding and truncation errors. The original design starts with pixelated images with maximum edge definition accuracy. The curvilinear shapes are then fitted into polygons to comply with industry standard formats, thus losing edge definition accuracy. The polygons are then converted to raster images to approximate the original intended data. This dissertation builds on modern advancements in digital image and video processing to allow for a new image-based format, Sequential-Pixel-Frame, specifically for integrated circuit pattern representation. Unlike standard lossy compressed video, the new format contains all the information and accuracy intended for mask making and direct write. The new format is defined to replace the old historical polygon-based formats. In addition, the dissertation proposes a more efficient data flow from tape-out to mask making. The key advantages of the new format are a smaller file size and a reduced processing time for the more complex patterns intended for advanced technology nodes. However, the new format did not offer such advantages for the older technology nodes. This is in line with the goals and expectations of the research.
14

Seegmiller, Ray D., Greg C. Willden, Maria S. Araujo, Todd A. Newton, Ben A. Abbott, and William A. Malatesta. "Automation of Generalized Measurement Extraction from Telemetric Network Systems." International Foundation for Telemetering, 2012. http://hdl.handle.net/10150/581647.

Abstract:
ITC/USA 2012 Conference Proceedings / The Forty-Eighth Annual International Telemetering Conference and Technical Exhibition / October 22-25, 2012 / Town and Country Resort & Convention Center, San Diego, California
In telemetric network systems, data extraction is often an afterthought. The data description frequently changes throughout the program, so that last-minute modifications of the data extraction approach are often required. This paper presents an alternative approach in which automation of measurement extraction is supported. The central key is a formal declarative language that can be used to configure instrumentation devices as well as measurement extraction devices. The Metadata Description Language (MDL) defined by the integrated Network Enhanced Telemetry (iNET) program, augmented with a generalized measurement extraction approach, addresses this issue. This paper describes the TmNS Data Extractor Tool, as well as lessons learned from commercial systems, the iNET program, and TMATS.
15

Manning, Dennis, Rick Williams, and Paul Ferrill. "Data Filtering Unit (DFU): Dealing With Cryptovariable Keys in Data Recorded Using the IRIG 106 Chapter 10 Format." International Foundation for Telemetering, 2006. http://hdl.handle.net/10150/604140.

Abstract:
ITC/USA 2006 Conference Proceedings / The Forty-Second Annual International Telemetering Conference and Technical Exhibition / October 23-26, 2006 / Town and Country Resort & Convention Center, San Diego, California
Recent advancements in IRIG 106 Chapter 10 recording systems allow the recording of all onboard 1553 bus and PCM traffic to a single medium. These advancements have also brought about the issue of extracting data with different levels of classification that was written to a single location. Carrying GPS “smart” weapons further complicates this issue since the recording of GPS keys adds another level of classification to the mix. The ability to separate and/or remove higher-level data from a data product is now required. This paper describes the design of a hardware device that will filter specified data from IRIG 106 Chapter 10 recorder memory modules (RMMs) to prevent the storage device or computer from becoming classified at the level of the specified data.
16

Munir, Rana Faisal. "Storage format selection and optimization for materialized intermediate results in data-intensive flows." Doctoral thesis, Universitat Politècnica de Catalunya, 2019. http://hdl.handle.net/10803/668476.

Abstract:
Modern organizations produce and collect large volumes of data that need to be processed repeatedly and quickly for gaining business insights. For such processing, typically, Data-intensive Flows (DIFs) are deployed on distributed processing frameworks. The DIFs of different users have many computation overlaps (i.e., parts of the processing are duplicated), thus wasting computational resources and increasing the overall cost. The output of these computation overlaps (known as intermediate results) can be materialized for reuse, which helps in reducing the cost and saves computational resources if properly done. Furthermore, the way such outputs are materialized must be considered, as different storage layouts (i.e., horizontal, vertical, and hybrid) can be used to reduce the I/O cost. In this PhD work, we first propose a novel approach for automatically materializing the intermediate results of DIFs through a multi-objective optimization method, which can tackle multiple and conflicting quality metrics. Next, we study the behavior of different operators of DIFs that are the first to process the loaded materialized results. Based on this study, we devise a rule-based approach, which decides the storage layout for materialized results based on the subsequent operation types. Despite improving the cost in general, the heuristic rules do not consider the amount of data read while making the choice, which could lead to a wrong decision. Thus, we design a cost model that is capable of finding the right storage layout for every scenario. The cost model uses data and workload characteristics to estimate the I/O cost of materialized intermediate results with different storage layouts and chooses the one which has the minimum cost. The results show that storage layouts help to reduce the loading time of materialized results and, overall, improve the performance of DIFs. The thesis also focuses on the optimization of the configurable parameters of hybrid layouts. We propose ATUN-HL (Auto TUNing Hybrid Layouts), which, based on the same cost model and given the workload and characteristics of the data, finds the optimal values for configurable parameters in hybrid layouts (i.e., Parquet). Finally, the thesis also studies the impact of parallelism in DIFs and hybrid layouts. Our proposed cost model helps to devise an approach for fine-tuning the parallelism by deciding the number of tasks and machines to process the data. Thus, the cost model proposed in this thesis enables choosing the best possible storage layout for materialized intermediate results, tuning the configurable parameters of hybrid layouts, and estimating the number of tasks and machines for the execution of DIFs.
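The layout-selection idea can be illustrated with a toy cost model. The formulas and constants below are assumptions made for the example, not the thesis's actual model: a horizontal (row-oriented) layout reads whole rows, a vertical (column-oriented) layout reads only the projected columns, and a hybrid layout adds a per-row-group overhead.

```python
def io_cost(layout, n_rows, col_sizes, projected, row_group_rows=100_000):
    """Toy I/O estimate (bytes) for reading `projected` columns under a given layout."""
    row_size = sum(col_sizes.values())
    projected_size = sum(col_sizes[c] for c in projected)
    if layout == "horizontal":               # row-oriented: must scan whole rows
        return n_rows * row_size
    if layout == "vertical":                 # column-oriented: reads only projected columns
        return n_rows * projected_size
    if layout == "hybrid":                   # column chunks inside row groups, plus per-group overhead
        n_groups = -(-n_rows // row_group_rows)
        return n_rows * projected_size + n_groups * 4096
    raise ValueError(f"unknown layout: {layout}")

def choose_layout(n_rows, col_sizes, projected):
    costs = {layout: io_cost(layout, n_rows, col_sizes, projected)
             for layout in ("horizontal", "vertical", "hybrid")}
    return min(costs, key=costs.get), costs

cols = {"id": 8, "features": 400, "label": 4}        # hypothetical column widths in bytes
print(choose_layout(10_000_000, cols, projected=["id", "label"]))
```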
17

Meric, Burak, Michael Graul, Ronald Fernandes, and Charles H. Jones. "DESIGN OF AN INTERLINGUA FOR DATA DISPLAY SYSTEMS." International Foundation for Telemetering, 2003. http://hdl.handle.net/10150/605580.

Abstract:
International Telemetering Conference Proceedings / October 20-23, 2003 / Riviera Hotel and Convention Center, Las Vegas, Nevada
This paper presents the description of a new XML-based data display language called Data Display Markup Language (DDML) that can be used as an interlingua for different data display configuration formats. Translation of data display configuration between various vendor formats can be accomplished by translating in and out of DDML. The DDML can also be used as a vendor-neutral format for archiving and retrieving display configurations in a test and evaluation (T&E) configuration repository.
18

Leopold, Henrik, Han van der Aa, Fabian Pittke, Manuel Raffel, Jan Mendling, and Hajo A. Reijers. "Searching textual and model-based process descriptions based on a unified data format." Springer Berlin Heidelberg, 2019. http://dx.doi.org/10.1007/s10270-017-0649-y.

Abstract:
Documenting business processes using process models is common practice in many organizations. However, not all process information is best captured in process models. Hence, many organizations complement these models with textual descriptions that specify additional details. The problem with this supplementary use of textual descriptions is that existing techniques for automatically searching process repositories are limited to process models. They are not capable of taking the information from textual descriptions into account and, therefore, provide incomplete search results. In this paper, we address this problem and propose a technique that is capable of searching textual as well as model-based process descriptions. It automatically extracts activity-related and behavioral information from both description types and stores it in a unified data format. An evaluation with a large Austrian bank demonstrates that the additional consideration of textual descriptions allows us to identify more relevant processes from a repository.
19

Jeong, Ki Tai. "A Common Representation Format for Multimedia Documents." Thesis, University of North Texas, 2002. https://digital.library.unt.edu/ark:/67531/metadc3336/.

Abstract:
Multimedia documents are composed of multiple file format combinations, such as image and text, image and sound, or image, text and sound. The type of multimedia document determines the form of analysis for knowledge architecture design and retrieval methods. Over the last few decades, theories of text analysis have been proposed and applied effectively. In recent years, theories of image and sound analysis have been proposed to work with text retrieval systems and progressed quickly due in part to rapid progress in computer processing speed. Retrieval of multimedia documents formerly was divided into the categories of image and text, and image and sound. While standard retrieval process begins from text only, methods are developing that allow the retrieval process to be accomplished simultaneously using text and image. Although image processing for feature extraction and text processing for term extractions are well understood, there are no prior methods that can combine these two features into a single data structure. This dissertation will introduce a common representation format for multimedia documents (CRFMD) composed of both images and text. For image and text analysis, two techniques are used: the Lorenz Information Measurement and the Word Code. A new process named Jeong's Transform is demonstrated for extraction of text and image features, combining the two previous measurements to form a single data structure. Finally, this single data structure is analyzed by using multi-dimensional scaling. This allows multimedia objects to be represented on a two-dimensional graph as vectors. The distance between vectors represents the magnitude of the difference between multimedia documents. This study shows that image classification on a given test set is dramatically improved when text features are encoded together with image features. This effect appears to hold true even when the available text is diffused and is not uniform with the image features. This retrieval system works by representing a multimedia document as a single data structure. CRFMD is applicable to other areas of multimedia document retrieval and processing, such as medical image retrieval, World Wide Web searching, and museum collection retrieval.
20

Boman, Maria. "XML/EDI - EDI med XML-format?" Thesis, University of Skövde, Department of Computer Science, 2000. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-422.

Abstract:

Electronic data interchange (EDI) is a technique used to exchange electronic documents between buyers and sellers. These documents are transferred in a strictly standardised electronic way; the documents can be invoices, order forms, delivery schedules, etc. EDI has existed for over 25 years and is widespread around the world. Using EDI requires large resources in the form of capital and knowledge. Small and medium-sized enterprises often lack the resources required to be able to use EDI.

New techniques are constantly being developed. Extensible Markup Language (XML) is a new technique intended primarily for use on the Internet. XML marks up documents with tags, which helps the reader identify the content of the documents. XML has attracted the interest of the EDI community, since XML is considered to have the capacity to be used as an EDI format while also being adapted to the Internet, which means that EDI could more easily be used on the Internet.

In this thesis I have investigated whether companies that use EDI intend to adopt XML/EDI. I conducted interviews to answer the research question. The results show that the companies intend to use XML/EDI if a proper standard is developed. The companies are also positive towards XML/EDI and Internet-based EDI. The main reason for this positive attitude is that XML/EDI and Internet-based EDI would mean lower costs, which in turn would mean that even small and medium-sized enterprises could use the technology.

21

Harward, Gregory Brent. "Suitability of the NIST Shop Data Model as a Neutral File Format for Simulation." Diss., Brigham Young University, 2005. http://contentdm.lib.byu.edu/ETD/image/etd899.pdf.

22

Thornbrue, James R. (James Raymond) 1976. "Adaptive format conversion information as enhancement data for the high-definition television migration path." Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/29618.

Abstract:
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2003.
Includes bibliographical references (p. 115-116).
Prior research indicates that a scalable video codec based on adaptive format conversion (AFC) information may be ideally suited to meet the demands of the migration path for high-definition television. Most scalable coding schemes use a single format conversion technique and encode residual information in the enhancement layer. Adaptive format conversion is different in that it employs more than one conversion technique. AFC partitions a video sequence into small blocks and selects the format conversion filter with the best performance in each block. Research shows that the bandwidth required for this type of enhancement information is small, yet the improvement in video quality is significant. This thesis focuses on the migration from 1080I to 1080P using adaptive deinterlacing. Two main questions are answered. First, how does adaptive format conversion perform when the base layer video is compressed in a manner typical to high-definition television? It was found that when the interlaced base layer was compressed to 0.3 bpp, the mean base layer PSNR was 32 dB and the PSNR improvement due to the enhancement layer was as high as 4 dB. Second, what is the optimal tradeoff between base layer and enhancement layer bandwidth? With the total bandwidth fixed at 0.32 bpp, it was found that the optimal bandwidth allocation was about 96% base layer, 4% enhancement layer using fixed, 16x16 pixel partitions. The base and enhancement layer video at this point were compared to 100% base layer allocation and the best nonadaptive format conversion. While there was usually no visible difference in base layer quality, the adaptively deinterlaced enhancement layer was generally sharper, with cleaner edges, less flickering, and fewer aliasing artifacts than the best nonadaptive method. Although further research is needed, the results of these experiments support the idea of using adaptive deinterlacing in the HDTV migration path.
by James R. Thornbrue.
S.M.
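The per-block filter selection described in the abstract above can be illustrated with a small numpy sketch: for each block of an interlaced field, a few candidate deinterlacing filters are evaluated against the original progressive frame (which the encoder has), and only the index of the winning filter is kept as enhancement data. The block size and the two candidate filters are arbitrary choices for the example, not the thesis's actual configuration.

```python
import numpy as np

def deinterlace_candidates(field):
    """Two candidate reconstructions of the missing odd lines from the kept even lines."""
    h, _ = field.shape
    rep = field.copy()
    rep[1::2] = field[0::2]                                   # candidate 0: line repetition
    lin = field.copy()
    above = field[0:h - 1:2].astype(np.float32)
    below = np.vstack([field[2::2], field[h - 2:h - 1]])      # repeat last kept line at the edge
    lin[1::2] = ((above + below) / 2).astype(np.uint8)        # candidate 1: linear interpolation
    return [rep, lin]

def select_filters(original, field, block=16):
    """Per block, pick the candidate with the lowest squared error against the original frame."""
    candidates = deinterlace_candidates(field)
    h, w = original.shape
    choices = np.zeros((h // block, w // block), dtype=np.int32)
    for by in range(h // block):
        for bx in range(w // block):
            sl = np.s_[by * block:(by + 1) * block, bx * block:(bx + 1) * block]
            errors = [np.sum((original[sl].astype(np.float32) - c[sl]) ** 2) for c in candidates]
            choices[by, bx] = int(np.argmin(errors))
    return choices                                            # this map is the enhancement data

original = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
field = original.copy()
field[1::2] = 0                                               # simulate an interlaced field (even lines kept)
print(select_filters(original, field))
```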
23

Miller, Helen Buchanan. "The effect of graphic format, age, and gender on the interpretation of quantitative data." Diss., Virginia Polytechnic Institute and State University, 1989. http://hdl.handle.net/10919/54245.

Abstract:
The purpose of this study was to investigate the interpretation of numerical data when presented in four different graphic formats to different age groups and sexes. Fifth and sixth grade students (N=129) and eleventh and twelfth grade students (N=129) were assigned to four treatment groups. Each group viewed a different treatment slide with the same data displayed in one of four formats: table, line, line-table, or bar. After a narrative introduction, the students, while viewing the treatment graph, were asked to answer three types of questions: specific amount, static, and dynamic comparison. The students were then asked to continue viewing the graph for one full minute. After the minute elapsed, the projector was turned off and the students were asked to answer questions concerning the data presented on the graph. A 4 (Graph Type) X 2 (Age) X 2 (Gender) multivariate analysis of variance (MANOVA) with repeated measures for the four types of questions was implemented to determine the relations among graph type, age, gender, and four types of questions. The independent variables were type of graph (between), age (between), gender (between), and type of question (within). The dependent variable was the interpretation of quantitative information as measured by the test questions. The findings indicated that graphic format, age, and gender did affect the ability to interpret numerical data. The analysis demonstrated several statistically significant interaction effects: age and type of questions, graph and type of questions, and graph, age and type of questions. High-school students scored higher than elementary-school children on all four questions. Table graphs were effective for answering amount and static questions. As the questions became more complex, such as in a dynamic question, the table graph was one of the least effective means of graphic communication. For recall, the line-table format and line format were the most effective graphs. Age and gender differences emerged for particular graphs. Findings were discussed with regard to cognitive development implications.
Ed. D.
24

Fast, Tobias. "Alpha Tested Geometry in DXR : Performance Analysis of Asset Data Variations." Thesis, Blekinge Tekniska Högskola, Institutionen för datavetenskap, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-19730.

Abstract:
Background. Ray tracing can be used to achieve hyper-realistic 3D rendering, but it is a computationally heavy task. Since hardware support for real-time ray tracing was released, the game industry has been introducing this feature into games. However, even modern hardware still experiences performance issues when implementing common rendering techniques with ray tracing. One of these problematic techniques is alpha testing. Objectives. The thesis investigates the following: 1) How the texture format of the alpha map and the number of alpha maps affect the rendering times. 2) How tessellation of the alpha tested geometry affects the performance and whether tessellation has the potential to fully replace the alpha test from a performance perspective. Methods. A DXR 3D renderer was implemented, capable of rendering alpha tested geometry using an any-hit shader. The renderer was used to conduct a computational performance benchmark of the rendering times while varying texture and geometry data. Two alpha tested tree models were tessellated to various levels and their related textures were converted into multiple formats that could be used for the test scenes. Results & Conclusions. When the texture formats BC7, R(1xfloat32), and BC4 were used for the alpha map, the rendering times decreased in all cases relative to RGBA(4xfloat32). BC4 showed the best performance gain, decreasing the rendering times by up to 17% using one alpha map per model and by up to 43% using eight alpha maps. When increasing the number of alpha maps used per model, the rendering times increased by up to 52% when going from one alpha map to two. A large increase in rendering times was observed when going from three to four alpha maps in all cases. Using alpha testing on the tessellated model versions increased the rendering times in most cases, by at most 135%. A decrease of up to 8% was, however, observed when the models were tessellated a certain amount. Turning off alpha testing gave a significant decrease in rendering times, allowing higher tessellated versions to be rendered for all models. In one case, while increasing the number of triangles by a factor of 78, the rendering times were still decreased by 30% relative to the original alpha test implementation. This suggests that pre-tessellated models could potentially be used to replace alpha tested geometry when performance is highly required.
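The reported bandwidth savings track the per-pixel storage cost of the formats involved. As a rough back-of-the-envelope comparison (standard block-compression sizes, applied to a hypothetical 2048x2048 alpha map; these numbers are not taken from the thesis):

```python
# Bytes per pixel for the alpha-map formats compared in the experiment.
BYTES_PER_PIXEL = {
    "RGBA 4xfloat32": 16.0,
    "R 1xfloat32": 4.0,
    "BC7": 16 / 16,      # 16-byte block covering 4x4 pixels
    "BC4": 8 / 16,       # 8-byte block covering 4x4 pixels
}

width = height = 2048    # hypothetical alpha-map resolution
for fmt, bpp in BYTES_PER_PIXEL.items():
    print(f"{fmt:15s} {width * height * bpp / 2**20:6.1f} MiB")
```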
25

Hodges, Glenn A. "Designing a common interchange format for unit data using the Command and Control information exchange data model (C2IEDM) and XSLT." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2004. http://library.nps.navy.mil/uhtbin/hyperion/04Sep%5FHodges.pdf.

Abstract:
Thesis (M.S. in Modeling Virtual Environments and Simulation (MOVES))--Naval Postgraduate School, Sept. 2004.
Thesis advisor(s): Curtis Blais, Don Brutzman. Includes bibliographical references (p. 95-98). Also available online.
26

Toby, Inimary T., Mikhail K. Levin, Edward A. Salinas, Scott Christley, Sanchita Bhattacharya, Felix Breden, Adam Buntzman, et al. "VDJML: a file format with tools for capturing the results of inferring immune receptor rearrangements." BIOMED CENTRAL LTD, 2016. http://hdl.handle.net/10150/624652.

Abstract:
Background: The genes that produce antibodies and the immune receptors expressed on lymphocytes are not germline encoded; rather, they are somatically generated in each developing lymphocyte by a process called V(D)J recombination, which assembles specific, independent gene segments into mature composite genes. The full set of composite genes in an individual at a single point in time is referred to as the immune repertoire. V(D)J recombination is the distinguishing feature of adaptive immunity and enables effective immune responses against an essentially infinite array of antigens. Characterization of immune repertoires is critical in both basic research and clinical contexts. Recent technological advances in repertoire profiling via high-throughput sequencing have resulted in an explosion of research activity in the field. This has been accompanied by a proliferation of software tools for analysis of repertoire sequencing data. Despite the widespread use of immune repertoire profiling and analysis software, there is currently no standardized format for output files from V(D)J analysis. Researchers utilize software such as IgBLAST and IMGT/High V-QUEST to perform V(D)J analysis and infer the structure of germline rearrangements. However, each of these software tools produces results in a different file format, and can annotate the same result using different labels. These differences make it challenging for users to perform additional downstream analyses. Results: To help address this problem, we propose a standardized file format for representing V(D)J analysis results. The proposed format, VDJML, provides a common standardized format for different V(D)J analysis applications to facilitate downstream processing of the results in an application-agnostic manner. The VDJML file format specification is accompanied by a support library, written in C++ and Python, for reading and writing the VDJML file format. Conclusions: The VDJML suite will allow users to streamline their V(D)J analysis and facilitate the sharing of scientific knowledge within the community. The VDJML suite and documentation are available from https://vdjserver.org/vdjml/. We welcome participation from the community in developing the file format standard, as well as code contributions.
27

Berdugo, Albert, and Martin Small. "HIGH SPEED ASYNCHRONOUS DATA MULTIPLEXER/ DEMULTIPLEXER FOR HIGH DENSITY DIGITAL RECORDERS." International Foundation for Telemetering, 1996. http://hdl.handle.net/10150/608366.

Abstract:
International Telemetering Conference Proceedings / October 28-31, 1996 / Town and Country Hotel and Convention Center, San Diego, California
Modern High Density Digital Recorders (HDDR) are ideal devices for the storage of large amounts of digital and/or wideband analog data. Ruggedized versions of these recorders are currently available and are supporting many military and commercial flight test applications. However, in certain cases, the storage format becomes very critical, e.g., when a large number of data types are involved, or when channel-to-channel correlation is critical, or when the original data source must be accurately recreated during post mission analysis. A properly designed storage format will not only preserve data quality, but will yield the maximum storage capacity and record time for any given recorder family or data type. This paper describes a multiplex/demultiplex technique that formats multiple high speed data sources into a single, common format for recording. The method is compatible with many popular commercial recorder standards such as DCRsi, VLDS, and DLT. Types of input data typically include PCM, wideband analog data, video, aircraft data buses, avionics, voice, time code, and many others. The described method preserves tight data correlation with minimal data overhead. The described technique supports full reconstruction of the original input signals during data playback. Output data correlation across channels is preserved for all types of data inputs. Simultaneous real-time data recording and reconstruction are also supported.
28

Gregory, Richard Cedric Thomas, Art, College of Fine Arts, UNSW. "A graphic investigation of the atlas as a narrative format for the visual communication of cultural and social data." Awarded by: University of New South Wales. Art, 2009. http://handle.unsw.edu.au/1959.4/43798.

Abstract:
Maps and atlases are traditionally convenient documents for representing the surface of the earth. They provide an impression of spatial relationships and facilitate an appreciation of geographical and environmental characteristics. They are essential tools for creating an awareness of the world beyond the limits of our experience. Maps can also inform readers on the flow of cultural or economic influences, because they show localities in relation to their neighbours. Furthermore, they capture the reader's imagination by provoking the desire for adventure and exploration. Occasionally maps are also censored because they are an efficient means of indicating strategic features. This project concerns the historical and contemporary examples of communicating information visually by analysing a selection of conventional literary and visual sources, which informs the research. It includes graphic forms that present abundant data, for example, atlases and texts on the architectural history of Central Asia, Tibet, China and Japan. The studio works will examine illustration, draughtsmanship, rendering, and textual/visual imagery. The outcome will be an illustrated atlas of traditional architecture in the earthquake zones of Central Asia (Xinjiang), Tibet, China, Japan and related areas. The graphic format is used as a narrative for the communication of environmental, cultural and architectural data of the region. The atlas is also intended to present the subject in a holistic form in relation to environmental influences on the structures and materiality of buildings, and the broader field of history.
APA, Harvard, Vancouver, ISO, and other styles
29

Roguski, Łukasz 1987. "High-throughput sequencing data compression." Doctoral thesis, Universitat Pompeu Fabra, 2017. http://hdl.handle.net/10803/565775.

Full text
Abstract:
Thanks to advances in sequencing technologies, biomedical research has experienced a revolution over recent years, resulting in an explosion in the amount of genomic data being generated worldwide. The typical space requirement for storing sequencing data produced by a medium-scale experiment lies in the range of tens to hundreds of gigabytes, with multiple files in different formats being produced by each experiment. The current de facto standard file formats used to represent genomic data are text-based. For practical reasons, these are stored in compressed form. In most cases, such storage methods rely on general-purpose text compressors, such as gzip. Unfortunately, these methods are unable to exploit the information models specific to sequencing data, and as a result they usually provide limited functionality and insufficient savings in storage space. This explains why relatively basic operations such as processing, storage, and transfer of genomic data have become a typical bottleneck of current analysis setups. This thesis therefore focuses on methods to efficiently store and compress the data generated from sequencing experiments. First, we propose a novel general-purpose FASTQ file compressor. Compared to gzip, it achieves a significant reduction in the size of the resulting archive, while also offering high data processing speed. Next, we present compression methods that exploit the high sequence redundancy present in sequencing data. These methods achieve the best compression ratio among current state-of-the-art FASTQ compressors, without using any external reference sequence. We also demonstrate different lossy compression approaches to store auxiliary sequencing data, which allow for further reductions in size. Finally, we propose a flexible framework and data format that allow compression solutions to be generated semi-automatically without being tied to any specific genomic file format. To facilitate the data management needed by complex pipelines, multiple genomic datasets with heterogeneous formats can be stored together in configurable containers, with an option to perform custom queries over the stored data. Moreover, we show that simple solutions based on our framework can achieve results comparable to those of state-of-the-art format-specific compressors. Overall, the solutions developed and described in this thesis can easily be incorporated into current pipelines for the analysis of genomic data. Taken together, they provide grounds for the development of integrated approaches towards efficient storage and management of such data.
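As a toy illustration of why format-aware compression can beat a general-purpose compressor (this is not the thesis's actual compressor), the sketch below splits FASTQ records into three homogeneous streams and compresses each separately with zlib, assuming the usual four-line FASTQ record layout:

```python
import zlib

def split_streams(fastq_text):
    """Split four-line FASTQ records into homogeneous streams: IDs, bases, qualities."""
    lines = fastq_text.splitlines()
    return ("\n".join(lines[0::4]),   # @read identifiers
            "\n".join(lines[1::4]),   # nucleotide sequences
            "\n".join(lines[3::4]))   # quality strings

def compressed_sizes(fastq_text):
    whole = len(zlib.compress(fastq_text.encode(), 9))
    split = sum(len(zlib.compress(s.encode(), 9)) for s in split_streams(fastq_text))
    return whole, split   # separate, homogeneous streams are usually more compressible

record = "@read1\nACGTACGT\n+\nFFFFFFFF\n" * 1000
print(compressed_sizes(record))
```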
APA, Harvard, Vancouver, ISO, and other styles
30

Samaan, Mouna M., and Stephen C. Cook. "Configuration of Flight Test Telemetry Frame Formats." International Foundation for Telemetering, 1995. http://hdl.handle.net/10150/611587.

Full text
Abstract:
International Telemetering Conference Proceedings / October 30-November 02, 1995 / Riviera Hotel, Las Vegas, Nevada
The production of flight test plans has received attention from many research workers due to the increasing complexity of testing facilities, the complex demands posed by customers, and the large volume of data required from test flights. The paper opens with a review of research by other authors who have contributed to improving the preparation of flight test plans and the processing of the resulting data. This is followed by a description of a specific problem area: efficiently configuring the flight test data telemetry format (defined by the relevant standards while meeting user requirements for sampling rate and PCM word length). Following a description of a current semi-automated system, the authors propose an enhanced approach and demonstrate its efficiency through two case studies.
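A toy allocator conveys the kind of configuration problem being solved: placing measurements with different desired sample rates into the word slots of a PCM major frame. The rates, frame dimensions and measurement names below are invented, and real systems must also respect standard constraints not modelled here.

```python
# Toy allocator: given desired sample rates (Hz), a major-frame rate (Hz) and a
# minor-frame length in words, place each measurement in evenly spaced word slots.
def allocate(measurements, major_frame_hz, minor_frames, words_per_minor):
    frame = [[None] * words_per_minor for _ in range(minor_frames)]
    # Samples needed per major frame, most demanding measurements first.
    need = sorted(((m, max(1, round(rate / major_frame_hz)))
                   for m, rate in measurements.items()),
                  key=lambda x: -x[1])
    for name, samples in need:
        step = max(1, minor_frames // samples)   # spread samples across minor frames
        placed = 0
        for mf in range(0, minor_frames, step):
            if placed == samples:
                break
            try:
                w = frame[mf].index(None)        # first free word in this minor frame
            except ValueError:
                raise RuntimeError("format full; increase frame size")
            frame[mf][w] = name
            placed += 1
    return frame

layout = allocate({"accel_x": 400, "fuel_temp": 50, "strain_1": 100},
                  major_frame_hz=25, minor_frames=16, words_per_minor=8)
print(layout)
```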
APA, Harvard, Vancouver, ISO, and other styles
31

Brewer, Peter W., Daniel Murphy, and Esther Jansma. "Tricycle: A Universal Conversion Tool For Digital Tree-Ring Data." Tree-Ring Society, 2011. http://hdl.handle.net/10150/622638.

Full text
Abstract:
There are at least 21 dendro-data formats used in dendrochronology laboratories around the world. Many of these formats are read by a limited number of programs, thereby inhibiting collaboration, limiting critical review of analyses, and risking the long-term accessibility of datasets. Some of the older formats are supported by a single program and are falling into disuse, opening the risk for data to become obsolete and unreadable. These formats also have a variety of flaws, including but not limited to no accurate method for denoting measuring units, little or no metadata support, lack of support for variables other than whole ring widths (e.g. earlywood/latewood widths, ratios and density). The proposed long-term solution is the adoption of a universal data standard such as the Tree-Ring Data Standard (TRiDaS). In the short and medium term, however, a tool is required that is capable of converting not only back and forth to this standard, but between any of the existing formats in use today. Such a tool is also required to provide continued access to data archived in obscure formats. This paper describes TRiCYCLE, a new application that does just this. TRiCYCLE is an open-source, cross-platform, desktop application for the conversion of the most commonly used data formats. Two open source Java libraries upon which TRiCYCLE depends are also described. These libraries can be used by developers to implement support for all data formats within their own applications.
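The architectural point, that one neutral data model turns an O(n^2) set of pairwise converters into n readers plus n writers, can be sketched as follows; the formats and field names here are invented stand-ins, not TRiCYCLE's actual codecs.

```python
def read_legacy(text):
    """Parse a minimal, made-up ring-width series into the neutral model."""
    name, *values = text.split()
    return {"series": name, "unit": "1/100 mm", "widths": [int(v) for v in values]}

def write_csv(model):
    header = f"# {model['series']} ({model['unit']})"
    return header + "\n" + "\n".join(str(w) for w in model["widths"])

READERS = {"legacy": read_legacy}
WRITERS = {"csv": write_csv}

def convert(text, src, dst):
    # n readers + n writers instead of n*n pairwise converters
    return WRITERS[dst](READERS[src](text))

print(convert("OAK01 102 98 115 120", "legacy", "csv"))
```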
APA, Harvard, Vancouver, ISO, and other styles
32

Revels, Kenneth W. "Constraints of Migrating Transplant Information System's Legacy Data to an XML Format For Medical Applications Use." NSUWorks, 2001. http://nsuworks.nova.edu/gscis_etd/799.

Full text
Abstract:
This dissertation presents the development of two methodologies to migrate legacy data elements to an open environment. Changes in the global economy and the increasingly competitive business climate are driving companies to manage legacy data in new ways. Legacy data is used for strategic decisions as well as short-term decisions. Data migration involves replacing problematic hardware and software: the legacy data elements are placed into different file formats and then migrated to open system environments. The purpose of this study was to develop migration methodologies to move legacy data to an XML format. The techniques used for developing the intermediate delimited file and the XML schema involved system development life cycle (SDLC) procedures. These procedures are part of the overall SDLC methodologies used to guide this project to a successful conclusion; they helped in planning, scheduling, and implementing the project steps. This study presents development methodologies for creating XML schemas that save man-hours. XML technology is very flexible in that it can be published to many different platforms that are ODBC compliant, using TCP/IP as its transport protocol. This study provides a methodology that steers the step-by-step migration of legacy information to an open environment. The incremental migration methodology was used to create and migrate the intermediate legacy data elements, and the FAST methodology was used to develop the XML schema. As a result, the legacy data can reside in a more efficient and useful data processing environment.
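A minimal sketch of the two-step idea, exporting legacy records to a delimited intermediate file and then wrapping them in XML, using Python's standard library; the field names are invented, not the transplant system's actual schema.

```python
import csv, io
import xml.etree.ElementTree as ET

# Toy version of the migration: legacy records arrive as a delimited intermediate
# file and are wrapped in XML elements named after the delimited column headers.
delimited = "patient_id|organ|status\n1001|kidney|active\n1002|liver|waiting\n"

root = ET.Element("transplants")
for row in csv.DictReader(io.StringIO(delimited), delimiter="|"):
    rec = ET.SubElement(root, "record")
    for field, value in row.items():
        ET.SubElement(rec, field).text = value

print(ET.tostring(root, encoding="unicode"))
```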
APA, Harvard, Vancouver, ISO, and other styles
33

Mehrez, Ichrak. "Auto-tuning pour la détection automatique du meilleur format de compression pour matrice creuse." Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLV054/document.

Full text
Abstract:
Several applications in scientific computing deal with large sparse matrices having regular or irregular structures. In order to reduce both the memory space and the computing time required, these matrices call for a particular data storage (compression) structure as well as the use of parallel/distributed target architectures. The choice of the most appropriate compression format generally depends on several factors, in particular the structure of the sparse matrix, the numerical method, and the target architecture. Given the diversity of these factors, a choice optimized for one input data set may perform poorly on another; hence the interest of a system that automatically selects the best compression format (OCF, Optimal Compression Format) by taking these factors into account. This thesis addresses exactly that problem. We detail our approach by presenting the design of an auto-tuning system for OCF selection: given a sparse matrix, a numerical method, a parallel programming model and an architecture, the system automatically selects the OCF. In a first step, we validate our modelling with a case study involving (i) the Horner scheme and then the sparse matrix-vector product (SMVP) as numerical methods, (ii) CSC, CSR, ELL and COO as compression formats, (iii) data parallelism as the programming model, and (iv) a multicore platform as the target architecture. This study allows us to extract a set of metrics and parameters that influence OCF selection. We show that the metrics extracted from the data-parallel model analysis are not sufficient to make the decision, so we define new metrics involving the number of operations performed by the numerical method and the number of (indirect) memory accesses, and we propose a decision process that takes into account both the data-parallel model analysis and the algorithm analysis. In a second step, based on the data extracted previously, we use machine learning algorithms to predict the OCF for a given sparse matrix. An experimental study targeting a multicore parallel platform and processing random and/or real-world sparse matrices validates our approach and evaluates its performance. As future work, we aim to validate our approach on other parallel platforms such as GPUs.
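For readers unfamiliar with the formats named above, the sketch below builds the CSR arrays of a small matrix, runs a sparse matrix-vector product over them, and computes one simple structural metric of the kind such a selector might consume; it assumes NumPy and is not the thesis's auto-tuner.

```python
import numpy as np

def to_csr(dense):
    """Compress a dense matrix into CSR arrays (values, column indices, row pointers)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, x in enumerate(row):
            if x != 0:
                values.append(x); col_idx.append(j)
        row_ptr.append(len(values))
    return np.array(values), np.array(col_idx), np.array(row_ptr)

def csr_spmv(values, col_idx, row_ptr, x):
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(y)):
        s, e = row_ptr[i], row_ptr[i + 1]
        y[i] = values[s:e] @ x[col_idx[s:e]]
    return y

A = np.array([[4., 0., 0.], [0., 0., 2.], [1., 0., 3.]])
v, c, r = to_csr(A)
assert np.allclose(csr_spmv(v, c, r, np.ones(3)), A @ np.ones(3))

# One candidate metric for format selection: row-length variability, which often
# separates CSR-friendly from ELL-friendly matrices.
row_lengths = np.diff(r)
print(row_lengths.mean(), row_lengths.std())
```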
APA, Harvard, Vancouver, ISO, and other styles
34

Fernandes, Ronald, Michael Graul, Burak Meric, and Charles H. Jones. "ONTOLOGY-DRIVEN TRANSLATOR GENERATOR FOR DATA DISPLAY CONFIGURATIONS." International Foundation for Telemetering, 2004. http://hdl.handle.net/10150/605328.

Full text
Abstract:
International Telemetering Conference Proceedings / October 18-21, 2004 / Town & Country Resort, San Diego, California
This paper presents a new approach for the effective generation of translator scripts that can be used to automate the translation of data display configurations from one vendor format to another. Our approach uses the IDEF5 ontology description method to capture the ontology of each vendor format and provides simple rules for performing mappings. In addition, the method includes the specification of mappings between a language-specific ontology and its corresponding syntax specification, that is, either an eXtensible Markup Language (XML) Schema or Document Type Definition (DTD). Finally, we provide an algorithm for automatically generating eXtensible Stylesheet Language Transformation (XSLT) scripts that transform XML documents from one language to another. The method is implemented in a graphical tool called the Data Display Translator Generator (DDTG) that supports both inter-language (ontology-to-ontology) and intra-language (syntax-to-ontology) mappings and generates the XSLT scripts. The tool renders the XML Schema or DTD as trees, provides intuitive, user-friendly interfaces for performing the mappings, and provides a report of completed mappings. It also generates data type conversion code when both the source and target syntaxes are XML Schema-based. Our approach has the advantage of performing language mappings at an abstract, ontology level, and it facilitates the mapping of tool ontologies to a common domain ontology (in our case, the Data Display Markup Language, or DDML), thereby avoiding the O(n^2) mapping problem that arises when many data formats in the same domain must be translated pairwise.
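The last step, applying a generated XSLT script to move a display description between XML vocabularies, looks roughly like this; the stylesheet here is a hand-written stand-in for a DDTG-generated one, the element names are invented, and the example assumes the third-party lxml package.

```python
from lxml import etree   # assumes the third-party lxml package is installed

# Hand-written stand-in for a generated script: renames a vendor-specific
# <Gauge> element into a neutral <display> element (tag names are invented).
xslt = etree.XML(b"""
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/Gauge">
    <display type="gauge">
      <label><xsl:value-of select="@name"/></label>
      <units><xsl:value-of select="Units"/></units>
    </display>
  </xsl:template>
</xsl:stylesheet>""")

source = etree.XML(b'<Gauge name="Fuel Flow"><Units>kg/s</Units></Gauge>')
transform = etree.XSLT(xslt)
print(str(transform(source)))
```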
APA, Harvard, Vancouver, ISO, and other styles
35

Nicolai, Andreas. "DELPHIN 6 Climate Data File Specification, Version 1.0." Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2017. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-221222.

Full text
Abstract:
This paper describes the file format of the climate data container used by the DELPHIN, THERAKLES and NANDRAD simulation programs. The climate data container format holds a binary representation of annual and continuous climatic data needed for hygrothermal transport and building energy simulation models. The content of the C6B-Format is roughly equivalent to the epw-climate data format.
APA, Harvard, Vancouver, ISO, and other styles
36

Wu, Yuanyuan. "HADOOP-EDF: LARGE-SCALE DISTRIBUTED PROCESSING OF ELECTROPHYSIOLOGICAL SIGNAL DATA IN HADOOP MAPREDUCE." UKnowledge, 2019. https://uknowledge.uky.edu/cs_etds/88.

Full text
Abstract:
A rapidly growing volume of electrophysiological signal data is being generated for clinical research in neurological disorders. The European Data Format (EDF) is a standard format for storing electrophysiological signals. However, the bottleneck of existing signal analysis tools when handling large-scale datasets is the sequential loading of large EDF files before performing an analysis. To overcome this, we developed Hadoop-EDF, a distributed signal processing tool that loads EDF data in parallel using Hadoop MapReduce. Hadoop-EDF uses a robust data partition algorithm that makes EDF data processable in parallel. We evaluate Hadoop-EDF's scalability and performance by leveraging two datasets from the National Sleep Research Resource and running experiments on Amazon Web Services clusters. On a 20-node cluster, Hadoop-EDF performs 27 times and 47 times better than sequential processing of 200 small files and 200 large files, respectively. The results demonstrate that Hadoop-EDF is more suitable and effective for processing large EDF files.
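The MapReduce decomposition described above can be illustrated without a Hadoop cluster: the plain-Python sketch below maps per-channel blocks of samples to partial sums in parallel processes and reduces them to per-channel means. Channel names and values are invented, and this is not the actual Hadoop-EDF job.

```python
from collections import defaultdict
from concurrent.futures import ProcessPoolExecutor

def mapper(split):
    # Emit (channel, (count, sum)) pairs for one data split of (channel, samples) blocks.
    return [(ch, (len(s), sum(s))) for ch, s in split]

def reducer(pairs):
    acc = defaultdict(lambda: [0, 0.0])
    for ch, (n, total) in pairs:
        acc[ch][0] += n
        acc[ch][1] += total
    return {ch: total / n for ch, (n, total) in acc.items()}   # mean per channel

if __name__ == "__main__":
    splits = [[("EEG1", [1.0, 2.0]), ("EEG2", [0.5])],
              [("EEG1", [3.0]), ("EEG2", [1.5, 2.5])]]
    with ProcessPoolExecutor() as pool:
        mapped = [pair for part in pool.map(mapper, splits) for pair in part]
    print(reducer(mapped))   # {'EEG1': 2.0, 'EEG2': 1.5}
```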
APA, Harvard, Vancouver, ISO, and other styles
37

Danielsson, Robin. "Jämförelser av MySQL och Apache Spark : För aggregering av smartmätardata i Big Data format för en webbapplikation." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-18667.

Full text
Abstract:
Smart electricity meters are a domain that generates data on the scale of Big Data. These data volumes are difficult to handle with traditional database solutions such as MySQL. Apache Spark is a framework that emerged to address these difficulties by implementing the MapReduce model for clustered networks of computers. One research question of this work is whether Apache Spark, running on a single machine, has advantages over MySQL for handling large amounts of data in JSON format when aggregating for web applications. The results of this work show that Apache Spark has a lower aggregation time than MySQL towards a web application from roughly 6.7 GB of JSON data and upwards for more complex aggregation queries on a single machine. The results also show that MySQL is better suited than Apache Spark for simpler aggregation queries across all data volumes in the experiment.
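A minimal PySpark sketch of the kind of aggregation query timed in the experiment (it assumes the pyspark package; the file path and field names are invented):

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("meter-aggregation").getOrCreate()

readings = spark.read.json("smart_meter_readings.json")   # one JSON object per line
per_meter = (readings
             .groupBy("meter_id")
             .agg(F.sum("kwh").alias("total_kwh"),
                  F.avg("kwh").alias("avg_kwh")))
per_meter.show()
```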
APA, Harvard, Vancouver, ISO, and other styles
38

Scanlan, JD. "A context aware attack detection system across multiple gateways in real-time." Thesis, Honours thesis, University of Tasmania, 2004. https://eprints.utas.edu.au/117/1/Thesis_Final.pdf.

Full text
Abstract:
It is understood that intrusion detection systems can make more intelligent decisions if the context of the traffic being observed is known. This thesis examines whether an attack detection system, looking at traffic as it arrives at gateways or firewalls, can make smarter decisions if the context of attack patterns across a class of IP addresses is known. A system that detects and forestalls the continuation of both fast attacks and slow attacks across several IP addresses is described and the development of heuristics both to ban activity from hostile IP addresses and then lift these bans is illustrated. The System not only facilitates detection of methodical multiple gateway attacks, but also acts to defeat the attack before penetration can occur across the gateway range.
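A sliding-window ban heuristic of the general kind described can be sketched in a few lines; the window length, hit threshold and ban duration below are invented values, not the thresholds derived in the thesis.

```python
import time
from collections import defaultdict, deque

WINDOW_S, FAST_HITS, BAN_S = 60, 10, 600   # illustrative values only

events = defaultdict(deque)   # source IP -> recent probe timestamps
banned = {}                   # source IP -> time the ban expires

def observe(ip, now=None):
    if now is None:
        now = time.time()
    if ip in banned and now < banned[ip]:
        return "drop"                          # still banned
    q = events[ip]
    q.append(now)
    while q and now - q[0] > WINDOW_S:         # keep only the last WINDOW_S seconds
        q.popleft()
    if len(q) >= FAST_HITS:                    # fast attack across the gateway range
        banned[ip] = now + BAN_S
        return "ban"
    return "allow"
```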
APA, Harvard, Vancouver, ISO, and other styles
39

Campbell, Daniel A., and Lee Reinsmith. "Telemetry Definition and Processing (TDAP): Standardizing Instrumentation and EU Conversion Descriptions." International Foundation for Telemetering, 1997. http://hdl.handle.net/10150/607584.

Full text
Abstract:
International Telemetering Conference Proceedings / October 27-30, 1997 / Riviera Hotel and Convention Center, Las Vegas, Nevada
Telemetry format descriptions and engineering unit conversion calibrations are generated in an assortment of formats and numbering systems on various media. Usually this information comes to the central telemetry receiving/processing system from multiple sources, fragmented and disjointed. As present-day flight tests require more and more telemetry parameters to be instrumented and processed, standardization and automation for handling this ever-increasing amount of information become critical. In response to this need, the Telemetry Definition and Processing (TDAP) system has been developed by the Air Force Development Test Center (AFDTC), Eglin AFB, Florida. TDAP standardizes the format of the information required to convert PCM data and MIL-STD-1553 bus data into engineering units. This includes both the format of the data files and the software necessary to display, output, and extract subsets of data. These standardized files are electronically available for TDAP users to review/update and are then used to automatically set up telemetry acquisition systems. This paper describes how TDAP is used to standardize the development and operational test community's telemetry data reduction process, both real-time and post-test.
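At its core, an engineering-unit conversion of the sort TDAP standardizes is the application of calibration coefficients to raw counts; a minimal sketch with invented parameter names and coefficients (not TDAP's actual file layout):

```python
# Hypothetical EU conversion records: parameter name -> units and polynomial coefficients.
eu_definitions = {
    "FUEL_TEMP": {"units": "degC", "coeffs": [-40.0, 0.05]},        # EU = -40 + 0.05 * counts
    "ALTITUDE":  {"units": "ft",   "coeffs": [0.0, 2.0, 1.0e-4]},   # quadratic calibration
}

def to_engineering_units(parameter, counts):
    c = eu_definitions[parameter]["coeffs"]
    return sum(coef * counts ** power for power, coef in enumerate(c))

print(to_engineering_units("FUEL_TEMP", 1200), "degC")   # -> 20.0 degC
```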
APA, Harvard, Vancouver, ISO, and other styles
40

Fernandes, Ronald, Michael Graul, John Hamilton, Burak Meric, and Charles H. Jones. "DEVELOPING INTERNAL AND EXTERNAL TRANSLATORS FOR DATA DISPLAY SYSTEMS." International Foundation for Telemetering, 2005. http://hdl.handle.net/10150/604905.

Full text
Abstract:
ITC/USA 2005 Conference Proceedings / The Forty-First Annual International Telemetering Conference and Technical Exhibition / October 24-27, 2005 / Riviera Hotel & Convention Center, Las Vegas, Nevada
The focus of this paper is to describe a unified methodology for developing both internal and external data display translators between an Instrumentation Support System (ISS) format and Data Display Markup Language (DDML), a neutral language for describing data displays. The methodology includes aspects common to both ISSs that have a well documented text-based save format and those that do not, as well as aspects that are unique to each type. We will also describe the means by which an external translator can be integrated into a translator framework. Finally, we will describe how an internal translator can be integrated directly into the ISS.
APA, Harvard, Vancouver, ISO, and other styles
41

Bierza, Daniel. "Editor pasportizace VUT." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2009. http://www.nusl.cz/ntk/nusl-237477.

Full text
Abstract:
In this thesis I present the issue of passportization. I analyse the current status of the BUT buildings and describe possible future solutions for passportization at the BUT. I focus on the analysis of the "obr" format through reverse engineering, analyse the acquired data, and describe the way the passportization information is stored. Finally, I design a graphic browser and a passportization editor.
APA, Harvard, Vancouver, ISO, and other styles
42

Meuel, Peter. "Insertion de données cachées dans des vidéos au format H. 264." Montpellier 2, 2009. http://www.theses.fr/2009MON20218.

Full text
Abstract:
This thesis addresses two major problems raised by the massive adoption of the H.264 video format: protecting privacy in video surveillance and the need for secure methods of robust watermarking. The first contribution produces a single H.264-compatible video stream in which access to the filmed faces is restricted to holders of an encryption key. The measured performance validates real-time use in video surveillance cameras. The second contribution, on robust watermarking, builds on the state of the art in so-called secure watermarking applied to video: unlike encryption, the security of the method comes directly from embedding the data in a secret subspace. The work details all the steps needed to adapt and optimize this method for H.264 video.
APA, Harvard, Vancouver, ISO, and other styles
43

Edman, Fredrik. "Läsa och lagra data i JSON format för smart sensor : En jämförelse i svarstid mellan hybriddatabassystemet PostgreSQL och MongoDB." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-15482.

Full text
Abstract:
Social media generates large amounts of data, but other sources do so as well and store it in NoSQL database systems; smart sensors that register electrical consumption are one of them. MongoDB is a NoSQL database system that stores its data in the JSONB data format. PostgreSQL, an SQL database system, has in its more recent distributions also begun to handle JSONB, which makes PostgreSQL a kind of hybrid that supports operations for both SQL and NoSQL. In this study an experiment was conducted to see how these database systems handle reading and writing JSON data from smart sensors. Response times were recorded to address the hypothesis that PostgreSQL can be suitable for reading and writing JSON data generated by a smart sensor. The experiment showed that PostgreSQL's response time does not increase markedly for inserts as the amount of data grows, whereas MongoDB's does. As for the hypothesis that PostgreSQL can be suitable for JSON data, the answer is that it may well be, but the question is difficult to settle and further research is needed.
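A minimal sketch of what reading and writing JSON in the hybrid looks like on the PostgreSQL side, assuming the psycopg2 driver and a reachable database; the table, field names and connection string are placeholders.

```python
import json
import psycopg2   # assumes psycopg2 and a reachable PostgreSQL instance

conn = psycopg2.connect("dbname=sensors user=postgres")   # connection details are placeholders
cur = conn.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS readings (doc jsonb)")

sample = {"meter": "A1", "kwh": 0.42, "ts": "2018-04-01T12:00:00"}
cur.execute("INSERT INTO readings (doc) VALUES (%s::jsonb)", [json.dumps(sample)])
conn.commit()

# Aggregate straight over the JSONB field, the kind of query timed in the experiment.
cur.execute("SELECT avg((doc->>'kwh')::numeric) FROM readings WHERE doc->>'meter' = %s", ["A1"])
print(cur.fetchone()[0])
```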
APA, Harvard, Vancouver, ISO, and other styles
44

Stensmar, Isak. "Steganografi i bilder : En studie om bildformat och visuella bildrepresentationens påverkan vid lagring av data med hjälp av en steganografisk metod." Thesis, Blekinge Tekniska Högskola, Institutionen för datalogi och datorsystemteknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-12887.

Full text
Abstract:
Context. Using image steganography it is possible to hide a large amount of data without making noticeable changes to the carrier picture. One commonly used method is Least Significant Bit (LSB), which is often considered one of the first methods implemented and used in image steganography. Apart from the method, the user also has a choice when deciding which picture should act as the carrier of the information. What people often try to accomplish is a very complex method that hides the data efficiently, while forgetting about the picture used as the carrier. In the end, all measurements are taken on the picture. Objectives. This study investigates whether different image formats, BMP, PNG, JPEG and TIFF, have an impact on the differences observed when comparing the original picture with the modified one, given that data is stored with a steganographic method and is gradually increased. The study also investigates whether what the picture visually represents has an effect on the measurements. Methods. An extended variant of the Least Significant Bit method is implemented and used to create pictures in different image formats. An experiment investigates these formats by taking measurements with MSE (Mean Squared Error), PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity). Results. Comparing the formats in the resulting graphs and tables, JPEG showed better performance, with a lower differential value between each test. BMP, PNG and TIFF showed minimal differences between one another for each test. As for the visual representation of the pictures, two pictures showed a higher differential value after each test than the remaining three. Conclusions. The results of the experiment show that the compression method a format uses has an impact on the measurements. The results also suggest that the picture's visual representation can have an impact on the measurements, but more data is needed to confirm this.
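The basic LSB embedding and the MSE/PSNR measurements can be sketched with NumPy as follows; this is the plain LSB scheme plus the metrics, not the extended method implemented in the study.

```python
import numpy as np

def embed_lsb(carrier, payload_bits):
    """Overwrite the least significant bit of each pixel value with one payload bit."""
    flat = carrier.flatten()
    flat[:len(payload_bits)] = (flat[:len(payload_bits)] & 0xFE) | payload_bits
    return flat.reshape(carrier.shape)

def mse(a, b):
    return np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)

def psnr(a, b, peak=255.0):
    m = mse(a, b)
    return float("inf") if m == 0 else 10 * np.log10(peak ** 2 / m)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)   # stand-in grayscale image
bits = rng.integers(0, 2, size=1024, dtype=np.uint8)          # 1024 payload bits
stego = embed_lsb(image, bits)
print(mse(image, stego), psnr(image, stego))
```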
APA, Harvard, Vancouver, ISO, and other styles
45

Rådeström, Johan, and Gustav Skoog. "Realtidssammanställning av stora mängder data från tidsseriedatabaser." Thesis, KTH, Data- och elektroteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-208932.

Full text
Abstract:
Large amounts of time series data are generated and managed in utility systems and process industries to enable monitoring of those systems. When the time series are to be retrieved and compiled for data analysis, the time this takes becomes a problem. The purpose of this thesis was to determine how the extraction of time series data should be performed to give the systems the best possible response time. To make the extraction and compilation as efficient as possible, different techniques and methods were tested and evaluated. The techniques and methods were compared in the following areas: compiling data inside versus outside the database, caching, in-memory databases versus other databases, data formats, data transfer, and precalculation of data. The results showed that the best solution was to compile data in parallel outside the database, to use a custom built-in in-memory database, to use Google Protobuf as the data format, and to use precalculated data.
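The winning combination, compiling data in parallel outside the database, can be illustrated with a small sketch that splits a series into chunks and merges partial aggregates; the chunk contents and sizes are invented, and the Protobuf serialization and in-memory database are not shown.

```python
from concurrent.futures import ProcessPoolExecutor

def aggregate_chunk(chunk):
    """Partial aggregate over one chunk of (timestamp, value) points."""
    values = [v for _, v in chunk]
    return len(values), sum(values), min(values), max(values)

def aggregate_series(chunks):
    with ProcessPoolExecutor() as pool:
        partials = list(pool.map(aggregate_chunk, chunks))
    n = sum(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    return {"count": n, "mean": total / n,
            "min": min(p[2] for p in partials), "max": max(p[3] for p in partials)}

if __name__ == "__main__":
    chunks = [[(t, float(t % 7)) for t in range(start, start + 50_000)]
              for start in range(0, 200_000, 50_000)]
    print(aggregate_series(chunks))
```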
APA, Harvard, Vancouver, ISO, and other styles
46

Xu, Guorong. "Computational Pipeline for Human Transcriptome Quantification Using RNA-seq Data." ScholarWorks@UNO, 2011. http://scholarworks.uno.edu/td/343.

Full text
Abstract:
The main theme of this thesis research is the development of a computational pipeline for processing next-generation RNA sequencing (RNA-seq) data. RNA-seq experiments generate tens of millions of short reads for each DNA/RNA sample. The alignment of a large volume of short reads to a reference genome is a key step in NGS data analysis. Although storing alignment information in the Sequence Alignment/Map (SAM) or Binary SAM (BAM) format is now standard, biomedical researchers still have difficulty accessing the useful information it contains. To help biomedical researchers conveniently access essential information in NGS data files in SAM/BAM format, we have developed a Graphical User Interface (GUI) software tool named SAMMate that pipelines human transcriptome quantification. SAMMate allows researchers to easily process NGS data files in SAM/BAM format, is compatible with both single-end and paired-end sequencing technologies, and allows researchers to accurately calculate gene expression abundance scores.
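For orientation, the core of transcript quantification from SAM-format alignments is counting mapped reads per gene and normalizing; the sketch below uses a toy gene table and the common RPKM normalization, and is not SAMMate's actual algorithm.

```python
# Toy read counter over SAM-format alignment lines (fields: QNAME FLAG RNAME POS ...).
genes = {"geneA": ("chr1", 100, 500), "geneB": ("chr1", 900, 1500)}   # name -> (chrom, start, end)

sam_lines = [
    "read1\t0\tchr1\t120\t60\t50M\t*\t0\t0\tACGT\tFFFF",
    "read2\t0\tchr1\t950\t60\t50M\t*\t0\t0\tACGT\tFFFF",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tFFFF",          # unmapped (flag 0x4)
]

counts = {g: 0 for g in genes}
mapped = 0
for line in sam_lines:
    fields = line.split("\t")
    flag, rname, pos = int(fields[1]), fields[2], int(fields[3])
    if flag & 0x4:
        continue                      # skip unmapped reads
    mapped += 1
    for g, (chrom, start, end) in genes.items():
        if rname == chrom and start <= pos <= end:
            counts[g] += 1

# RPKM: reads per kilobase of gene length per million mapped reads.
rpkm = {g: counts[g] * 1e9 / (mapped * (genes[g][2] - genes[g][1])) for g in genes}
print(counts, rpkm)
```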
APA, Harvard, Vancouver, ISO, and other styles
47

Munir, Rana Faisal [Verfasser], Wolfgang [Akademischer Betreuer] Lehner, Alberto [Akademischer Betreuer] Abelló, and Oscar [Akademischer Betreuer] Romero. "Storage Format Selection and Optimization for Materialized Intermediate Results in Data-Intensive Flows / Rana Faisal Munir ; Wolfgang Lehner, Alberto Abelló, Oscar Romero." Dresden : Technische Universität Dresden, 2021. http://d-nb.info/1231847670/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Kalibjian, Jeff. "Data Security Architecture Considerations for Telemetry Post Processing Environments." International Foundation for Telemetering, 2017. http://hdl.handle.net/10150/626950.

Full text
Abstract:
Telemetry data has great value, as setting up a framework to collect and gather it involves significant costs. Further, the data itself has product diagnostic significance and may also have strategic national security importance if the product is defense or intelligence related. This potentially makes telemetry data a target for acquisition by hostile third parties. To mitigate this threat, data security principles should be employed by the organization to protect telemetry data. Data security is an important element of a layered security strategy for the enterprise. The value proposition centers on the argument that if organization perimeter/internal defenses (e.g. firewall, IDS, etc.) fail and hostile entities are able to access data found on internal company networks, they will still be unable to read the data because it will be encrypted. After reviewing important encryption background, including accepted practices, standards, and architectural considerations regarding disk, file, database and application data protection encryption strategies, specific data security options applicable to telemetry post-processing environments are discussed, providing tangible approaches to better protect organization telemetry data.
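As a concrete example of the data-at-rest layer discussed above, a few lines of symmetric encryption around a telemetry record; this assumes the third-party cryptography package and is only an illustration, not a recommended enterprise key-management design.

```python
from cryptography.fernet import Fernet   # assumes the third-party "cryptography" package

key = Fernet.generate_key()     # in practice the key would come from a key manager or HSM
cipher = Fernet(key)

telemetry = b"\x01\x02frame 0001: 1013.2 hPa, 287.1 K"   # stand-in telemetry record
stored = cipher.encrypt(telemetry)           # what lands on disk or in a database column
assert cipher.decrypt(stored) == telemetry   # only key holders can read it back
```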
APA, Harvard, Vancouver, ISO, and other styles
49

Siraskar, Nandkumar S. "Adaptive Slicing in Additive Manufacturing Process using a Modified Boundary Octree Data Structure." University of Cincinnati / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1353155811.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Vogelsang, Stefan, Heiko Fechner, and Andreas Nicolai. "Delphin 6 Material File Specification." Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2013. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-126274.

Full text
Abstract:
This paper describes the format of material data files that hold parameters needed by thermal and hygrothermal simulation tools such as Delphin, Hajawee (Dynamic Room Model) and Nandrad. The Material Data Files are containers for storing parameters and functions for heat and moisture transport and storage models. The article also discusses the application programming interface of the Material library that can be used to read/write material data files conveniently and efficiently.
APA, Harvard, Vancouver, ISO, and other styles
