
Dissertations / Theses on the topic 'ANALYZING DATA'

Consult the top 50 dissertations / theses for your research on the topic 'ANALYZING DATA.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Ju, Hyunsu. "Topics in analyzing longitudinal data." Texas A&M University, 2004. http://hdl.handle.net/1969.1/1565.

Abstract:
We propose methods for analyzing longitudinal data, obtained in clinical trials and other applications with repeated measures of responses taken over time. Common characteristics of longitudinal studies are correlated responses and observations taken at unequal points in time. The first part of this dissertation examines the justification of a block bootstrap procedure for repeated measurement designs, which takes into account the dependence structure of the data by resampling blocks of adjacent observations rather than individual data points. In the case of dependent stationary data, under regularity conditions, the approximately studentized or standardized block bootstrap possesses a higher order of accuracy. With longitudinal data, the second part of this dissertation shows that diagonal optimal weights for unbalanced designs can be made to improve the efficiency of the estimators in terms of the mean squared error criterion. A simulation study is conducted for each of the longitudinal designs. We also analyze a repeated measurement data set concerning nursing home residents with multiple sclerosis, which is obtained from a large database termed the minimum data set (MDS).
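For readers unfamiliar with the block bootstrap mentioned above, the sketch below shows the moving-block resampling idea in Python: blocks of adjacent observations are drawn with replacement so that short-range dependence survives the resampling. The toy series, block length, and statistic are illustrative assumptions, not the procedure from the dissertation.

    import numpy as np

    def moving_block_bootstrap(series, block_len, rng):
        # Concatenate randomly chosen blocks of adjacent observations,
        # preserving the dependence structure within each block.
        n = len(series)
        n_blocks = int(np.ceil(n / block_len))
        starts = rng.integers(0, n - block_len + 1, size=n_blocks)
        return np.concatenate([series[s:s + block_len] for s in starts])[:n]

    rng = np.random.default_rng(0)
    x = np.cumsum(rng.normal(size=200)) * 0.1 + rng.normal(size=200)  # dependent toy series
    boot_means = [moving_block_bootstrap(x, 10, rng).mean() for _ in range(1000)]
    print(np.std(boot_means))  # bootstrap standard error of the mean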
2

Roberg, Abigail M. "Data Visualizations: Guidelines for Gathering, Analyzing, and Designing Data." Ohio University Honors Tutorial College / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ouhonors1524826335755109.

3

Lau, Ho-yin Eric. "Statistical methods for analyzing epidemiological data." Click to view the E-thesis via HKUTO, 2005. http://sunzi.lib.hku.hk/hkuto/record/B34829969.

4

Lau, Ho-yin Eric, and 劉浩然. "Statistical methods for analyzing epidemiological data." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2005. http://hub.hku.hk/bib/B34829969.

5

Stetcenko, D. O., and Y. V. Smityuh. "Intellectual Data Analyzing Using Wavelet Transformation." Thesis, Sumy State University, 2016. http://essuir.sumdu.edu.ua/handle/123456789/47136.

Abstract:
This article considers the bragorectification setting (BRS) as a complex object of regulation operating under uncertainty. From the viewpoint of automatic control analysis and synthesis, the BRS is a complex machine with a sequential-parallel structure. It is proved that, for the analysis and synthesis of automatic control systems, the BRS of alcohol plants is a multifunctional object. Existing control algorithms are analyzed, and their advantages and disadvantages are discussed. The properties of the automated rectification device are shown through the interconnections by which changes in input parameters cause changes in output parameters.
6

Pan, Feng Wang Wei. "Efficient algorithms in analyzing genomic data." Chapel Hill, N.C. : University of North Carolina at Chapel Hill, 2009. http://dc.lib.unc.edu/u?/etd,2622.

Abstract:
Thesis (Ph. D.)--University of North Carolina at Chapel Hill, 2009.
Title from electronic title page (viewed Oct. 5, 2009). "... in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science." Discipline: Computer Science; Department/School: Computer Science.
7

Björck, Olof. "Analyzing gyro data based image registration." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-397459.

Abstract:
An analysis of gyro sensor data with regard to rotational image registration is conducted in this thesis. This is relevant for understanding how well images captured with a moving camera can be registered using only gyro sensor data as motion input, which is commonly the case for Electronic Image Stabilization (EIS) in handheld devices. The theory explaining how to register images based on gyro sensor data is presented, a qualitative analysis of gyro sensor data from three generic Android smartphones is conducted, and rotational image registration simulations using simulated noise as well as real gyro sensor data from the smartphones are presented. An accuracy metric for rotational image registration is presented that measures image registration accuracy in pixels (relevant for frame-to-frame image registration) or pixels per second (relevant for video EIS). This thesis shows that noise in gyro sensor data affects image registration accuracy to an extent that is noticeable in 1080x1920-resolution video displayed on larger monitors, such as a computer monitor, or when zooming digitally, but not to any significant extent when displayed on a monitor the size of a regular smartphone display without zooming. Different screen resolutions and frame rates will affect the image registration accuracy and would be interesting to investigate in further work, as would ways to improve the gyro sensor data.
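To make the registration step concrete: for a camera that only rotates, pixels map between consecutive frames through the homography H = K R K^-1, where K holds the camera intrinsics and R is integrated from the gyro rates. The Python sketch below uses an assumed intrinsic matrix, a made-up gyro sample, and a small-angle approximation; none of these values come from the thesis.

    import numpy as np

    K = np.array([[1500.0, 0.0, 960.0],   # assumed pinhole intrinsics
                  [0.0, 1500.0, 540.0],
                  [0.0, 0.0, 1.0]])

    def rotation_from_gyro(omega, dt):
        # Small-angle rotation from one angular-rate sample (rad/s) over dt seconds.
        wx, wy, wz = np.asarray(omega) * dt
        return np.array([[1.0, -wz, wy],
                         [wz, 1.0, -wx],
                         [-wy, wx, 1.0]])

    R = rotation_from_gyro([0.01, -0.02, 0.005], dt=1.0 / 30.0)
    H = K @ R @ np.linalg.inv(K)           # homography for pure rotation
    p = H @ np.array([1000.0, 500.0, 1.0])
    print(p[:2] / p[2])                    # where pixel (1000, 500) lands in the next frame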
8

Hossain, Abu. "General methods for analyzing bounded proportion data." Thesis, London Metropolitan University, 2017. http://repository.londonmet.ac.uk/1243/.

Abstract:
This thesis introduces two general classes of models for analyzing a proportion response variable Y that can take values between zero and one, inclusive of zero and/or one: the inflated GAMLSS model and the generalized Tobit GAMLSS model. The inflated GAMLSS model extends the flexibility of beta inflated models by allowing the distribution on (0,1) of the continuous component of the dependent variable to come from any explicit or transformed (i.e. logit or truncated) distribution on (0,1), including highly skewed and/or kurtotic or bimodal distributions. The generalized Tobit GAMLSS model relaxes the underlying normal distribution assumption of the latent variable in the Tobit model to a very general class of distributions on the real line. The thesis also provides likelihood inference as well as diagnostic and model selection tools for these classes of models. Applications of both models are conducted using different sets of data to check the robustness of the proposed models. The original contributions of the thesis start from chapter 4, in particular chapters 5, 6 and 7, with applications of the models in chapters 8, 9 and 10.
9

Ho, Wai-shing. "Techniques for managing and analyzing unconventional data." Click to view the E-thesis via HKUTO, 2004. http://sunzi.lib.hku.hk/hkuto/record/B39849028.

10

Ho, Wai-shing, and 何偉成. "Techniques for managing and analyzing unconventional data." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2004. http://hub.hku.hk/bib/B39849028.

11

McDermott, Matthew. "Fast Algorithms for Analyzing Partially Ranked Data." Scholarship @ Claremont, 2014. http://scholarship.claremont.edu/hmc_theses/58.

Abstract:
Imagine your local creamery administers a survey asking their patrons to choose their five favorite ice cream flavors. Any data collected by this survey would be an example of partially ranked data, as the set of all possible flavors is only ranked into subsets of the chosen flavors and the non-chosen flavors. If the creamery asks you to help analyze this data, what approaches could you take? One approach is to use the natural symmetries of the underlying data space to decompose any data set into smaller parts that can be more easily understood. In this work, I describe how to use permutation representations of the symmetric group to create and study efficient algorithms that yield such decompositions.
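As a quick sense of scale for such data: choosing five favorites from n flavors partitions the flavors into a chosen set and the rest, so the number of possible survey responses is a binomial coefficient. A two-line check (n = 20 is an assumed flavor count, not from the thesis):

    from math import comb, factorial

    n, k = 20, 5
    print(comb(n, k))                  # 15504 unordered top-5 responses
    print(comb(n, k) * factorial(k))   # 1860480 if the five choices are also ordered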
12

Dumas, Raphaël A. (Raphaël Antoine). "Analyzing transit equity using automatically collected data." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/103650.

Abstract:
Thesis: M.C.P., Massachusetts Institute of Technology, Department of Urban Studies and Planning, 2015.
Thesis: S.M. in Transportation, Massachusetts Institute of Technology, Department of Civil and Environmental Engineering, 2015.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 145-148).
By inferring individual passengers' origins, destinations, and transfers using automatically collected transit data, transit providers can obtain and analyze larger volumes of information, with more accuracy, and at more frequent intervals than are available through traditional origin-destination (OD) surveys. Automatic OD inference can be an input into the analysis and reporting of agencies' social goals, such as the provision of equitable service regardless of race, national origin, or ethnicity, which is federally required in the USA by Title VI of the Civil Rights Act of 1964. The methodology prescribed in the Title VI regulation, however, has not adapted to the opportunity to supplement supply metrics with passenger-centric demand metrics through the availability of OD data. The goal of this thesis is to demonstrate a preliminary methodology to link automatically inferred OD information from regular transit users to the demographic data of public transit commuters from the US Census's American Community Survey, and to examine variation in passenger-centric metrics such as journey time and speed. This study infers origins and destinations in the context of the Massachusetts Bay Transportation Authority (MBTA). From a sample month of these data, an example of a passenger-centric analysis is performed by comparing travel times and speeds of trips with origins in areas home to predominantly Black or African American transit commuters to travel times and speeds of trips with origins in areas home to predominantly White transit commuters. Commuters from predominantly Black or African American census tracts are found to have longer travel times and slower speeds relative to commuters from tracts where commuters are predominantly White. Differences are within agency specified margins, but are significant, in particular for journeys involving bus transfers. Short-term solutions such as through-routing of important bus routes and increasing reliability of bus departures at terminals and long-term solutions such as faster, more frequent Diesel Multiple Unit rail service are proposed and evaluated to mitigate these differences.
13

Lambert, Michel Joseph. "Visualizing and analyzing human-centered data streams." Thesis, Massachusetts Institute of Technology, 2005. http://hdl.handle.net/1721.1/33301.

Abstract:
Thesis (M. Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005.
Includes bibliographical references (p. 71-73).
The mainstream population is readily adapting to the notion that carrying mobile computational devices such as cell phones and PDAs on one's person is as essential as taking along one's watch or credit cards. In addition to their stated and oftentimes proprietary functionality, these technological innovations have the potential to also function as powerful sensory data collectors. These devices are able to record and store a variety of data about their owner's everyday activities, a new development that may significantly impact the way we recall information. Human memory, with its limitations and subjective recall of events, may now be supplemented by the latent potential of these in-place devices to accurately record one's daily activities, thereby giving us access to a wealth of information about our own lives. In order to make use of this recorded information, it must be presented in an easily understood format: timelines have been a traditional display metaphor for this type of data. This thesis explores the visualization and navigation schemes available for these large temporal data sets, and the types of analysis that they facilitate.
14

Huotari, N. (Niko). "Graphical user interface for analyzing radiological data." Master's thesis, University of Oulu, 2016. http://urn.fi/URN:NBN:fi:oulu-201606042269.

Abstract:
Brain research is increasingly focusing on critically sampled multimodal data. Due to the complexity of the brain, multiple measures are analyzed simultaneously to bring forth a more comprehensive picture of brain functions. Furthermore, the data has markedly increased in size, which places new demands on analysis tools. This master's thesis presents an MRI-compatible multimodal measurement arrangement, a Hepta-scan concept, and a toolbox (Nifty) for analyzing the measurements. The concept measures brain activity (MREG), non-invasive blood pressure (NIBP), electroencephalography (EEG), near-infrared spectroscopy (NIRS) and anesthesia data in synchrony. Nifty combines several existing and newly developed software tools to create a simple access point for all available tools. It includes a database which holds information on the large amount of data obtained in the multimodal measurements. This thesis presents the software and hardware parts of the Hepta-scan concept and explains the workflow in it. Finally, the design of the Nifty toolbox is presented and its functionality explained.
15

Zuo, Zhiya. "Analyzing collaboration with large-scale scholarly data." Diss., University of Iowa, 2019. https://ir.uiowa.edu/etd/7055.

Abstract:
We have never stopped in the pursuit of science. Standing on the shoulders of giants, we gradually make our path to build a systematic and testable body of knowledge to explain and predict the universe. Emerging from researchers' interactions and self-organizing behaviors, scientific communities feature intensive collaborative practice. Indeed, the era of the lone genius is long gone, and teams now dominate the production and diffusion of scientific ideas. In order to understand how collaboration shapes and evolves organizations as well as individuals' careers, this dissertation conducts analyses at both macroscopic and microscopic levels utilizing large-scale scholarly data. As self-organizing behaviors, collaborations boil down to the interactions among researchers. Understanding collaboration at the individual level, as a result, is in fact a preliminary and crucial step to better understand the collective outcome at the group and organization level. To start, I investigate the role of research collaboration in researchers' careers by leveraging person-organization fit theory. Specifically, I propose prospective social ties based on faculty candidates' future collaboration potential with future colleagues, which manifests diminishing returns on placement quality. Moving forward, I address the question of how individual success can be better understood and accurately predicted utilizing collaboration experience data. Findings reveal potential regularities in career trajectories for early-stage, mid-career, and senior researchers, highlighting the importance of various aspects of social capital. With large-scale scholarly data, I propose a data-driven analytics approach that leads to a deeper understanding of collaboration for both organizations and individuals. Managerial and policy implications are discussed for organizations to stimulate interdisciplinary research and for individuals to achieve better placement as well as short- and long-term scientific impact. Additionally, while analyzed in the context of academia, the proposed methods and implications can be generalized to knowledge-intensive industries, where collaboration is a key factor in performance outcomes such as innovation and creativity.
16

Jansen, Steven G. "3, 2, 1 blastoff: analyzing data through rocketry." Menomonie, WI: University of Wisconsin--Stout, 2006. http://www.uwstout.edu/lib/thesis/2006/2006jansens.pdf.

17

Bari, Wasimul. "Analyzing binary longitudinal data in adaptive clinical trials /." Internet access available to MUN users only, 2003. http://collections.mun.ca/u?/theses,167453.

18

Donnelly-Boyce, Courtney. "Method for analyzing juvenile growth data across populations." Connect to resource, 2008. http://hdl.handle.net/1811/32232.

19

Sibley, Christy N. "Analyzing Navy Officer Inventory Projection Using Data Farming." Thesis, Monterey, California. Naval Postgraduate School, 2012. http://hdl.handle.net/10945/6868.

Abstract:
Approved for public release, distribution unlimited
The Navy's Strategic Planning and Analysis Directorate (OPNAV N14) uses a complex model to project officer status in the coming years. The Officer Strategic Analysis Model (OSAM) projects officer status using an initial inventory, historical loss rates, and dependent functions for accessions, losses, lateral transfers, and promotions that reflect Navy policy and U.S. law. OSAM is a tool for informing decision makers as they consider potential policy changes, or analyze the impact of policy changes already in place, by generating Navy officer inventory projections for a specified time horizon. This research explores applications of data farming for potential improvement of OSAM. An analysis of OSAM inventory forecast variations over a large number of scenarios while changing multiple input parameters enables assessment of key inputs. This research explores OSAM through applying the principles of design of experiments, regression modeling, and nonlinear programming. The objectives of this portion of the work include identifying critical parameters, determining a suitable measure of effectiveness, assessing model sensitivities, evaluating performance across a spectrum of loss adjustment factors, and determining appropriate values of key model inputs for future use in forecasting Navy officer inventory.
20

Berrar, Daniel. "Machine learning methods for analyzing DNA microarray data." Thesis, University of Ulster, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.414098.

21

Andersen, Niklas. "Analyzing the impact of data compression in Hive." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-269235.

Abstract:
Executing expensive queries over many large tables can be prohibitively time consuming in conventional relational databases. Hadoop and its data warehouse Hive are a powerful alternative for large-scale data processing. Conventionally, data is stored in Hive without compression. There is value in storing the data with compression if the overhead of compression does not negatively impact the query processing time. Through experiments with imports, transformations, and exports of Hive data in various file formats and with different compression techniques, this paper describes how this can be achieved.
22

Korhonen, H. (Heikki). "Tool for analyzing data transfer scenarios in eNodeB." Master's thesis, University of Oulu, 2016. http://urn.fi/URN:NBN:fi:oulu-201609142780.

Abstract:
In software development, debugging is one option for finding a bug. Source code can be debugged by inserting print statements to investigate the values of variables, or by using a dedicated debugger tool. With a debugger, the program can be stopped at a certain point to inspect the values of variables without changing the code. Real-time software code is complex, and complex source code always requires careful testing, design and quality assurance. Debugging helps to achieve these requirements. Debugging is harder and more time consuming in a real-time environment, which means that developers must have effective debugging tools; effective debugging in a real-time environment also requires an informative logging tool. This thesis concentrates on helping LTE L2 debugging with the tool implemented in this work. The logging tool parses the binary data obtained from the eNodeB into a readable form in a text file. Traced fields and values can be investigated at a given point in time, and with this the L2 data flow can be verified.
23

Gaines, Tommi Lynn. "Statistical methods for analyzing multiple race response data." Diss., Restricted to subscribing institutions, 2008. http://proquest.umi.com/pqdweb?did=1580805511&sid=5&Fmt=2&clientId=1564&RQT=309&VName=PQD.

24

Kumar, Dharmendra. "A computationally efficient method of analyzing the parametric substructures." Thesis, The University of Arizona, 1985. http://hdl.handle.net/10150/275395.

25

Chava, Gopi Krishna. "Analyzing pressure and temperature data from smart plungers to optimize lift cycles." College Station, Tex.: Texas A&M University, 2008. http://hdl.handle.net/1969.1/ETD-TAMU-3217.

26

Flöter, André. "Analyzing biological expression data based on decision tree induction." [S.l.] : [s.n.], 2006. http://deposit.ddb.de/cgi-bin/dokserv?idn=978444728.

27

Flöter, André. "Analyzing biological expression data based on decision tree induction." Phd thesis, Universität Potsdam, 2005. http://opus.kobv.de/ubp/volltexte/2006/641/.

Abstract:

Modern biological analysis techniques supply scientists with various forms of data. One category of such data are the so-called "expression data". These data indicate the quantities of biochemical compounds present in tissue samples.

Recently, expression data can be generated at a high speed. This leads in turn to amounts of data no longer analysable by classical statistical techniques. Systems biology is the new field that focuses on the modelling of this information.

At present, various methods are used for this purpose. One superordinate class of these methods is machine learning. Methods of this kind had, until recently, predominantly been used for classification and prediction tasks. This neglected a powerful secondary benefit: the ability to induce interpretable models.

Obtaining such models from data has become a key issue within systems biology. Numerous approaches have been proposed and intensively discussed. This thesis focuses on the examination and exploitation of one basic technique: decision trees.

The concept of comparing sets of decision trees is developed. This method offers the possibility of identifying significant thresholds in continuous or discrete valued attributes through their corresponding set of decision trees. Finding significant thresholds in attributes is a means of identifying states in living organisms. Knowing about states is an invaluable clue to the understanding of dynamic processes in organisms. Applied to metabolite concentration data, the proposed method was able to identify states which were not found with conventional techniques for threshold extraction.

A second approach exploits the structure of sets of decision trees for the discovery of combinatorial dependencies between attributes. Previous work on this issue has focused either on expensive computational methods or the interpretation of single decision trees, a very limited exploitation of the data. This has led to incomplete or unstable results. That is why a new method is developed that uses sets of decision trees to overcome these limitations.

Both the introduced methods are available as software tools. They can be applied consecutively or separately. That way they make up a package of analytical tools that usefully supplement existing methods.

By means of these tools, the newly introduced methods were able to confirm existing knowledge and to suggest interesting and new relationships between metabolites.
28

Serpeka, Rokas. "Analyzing and modelling exchange rate data using VAR framework." Thesis, KTH, Matematik (Inst.), 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-94180.

Abstract:
In this report, analyses of foreign exchange rate time series are performed. First, triangular arbitrage is detected and eliminated from the data series using linear algebra tools. Then Vector Autoregressive processes are calibrated and used to replicate the dynamics of exchange rates as well as to forecast the time series. Finally, an optimal portfolio of currencies with minimal Expected Shortfall is formed using one-period-ahead forecasts.
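A minimal sketch of the calibrate-and-forecast step with the statsmodels VAR implementation; the simulated returns, currency-pair names, and fixed lag order are assumptions for illustration, not the data or settings of the report.

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.api import VAR

    rng = np.random.default_rng(1)
    returns = pd.DataFrame(rng.normal(scale=0.01, size=(500, 3)),
                           columns=["EURUSD", "GBPUSD", "USDJPY"])  # toy FX returns

    fit = VAR(returns).fit(2)   # lag order 2 is arbitrary here; the report selects its own
    ahead = fit.forecast(returns.values[-fit.k_ar:], steps=1)
    print(ahead)                # one-period-ahead forecast of all three series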
29

Liu, Kejun. "Software and Methods for Analyzing Molecular Genetic Marker Data." NCSU, 2003. http://www.lib.ncsu.edu/theses/available/etd-07182003-122001/.

Abstract:
Genetic analysis of molecular markers has allowed biologists to ask a wide variety of questions. This dissertation explores some aspects of the statistical and computational issues used in genetic marker data analysis. Chapter 1 gives an introduction to genetic marker data, as well as a brief description of each chapter. Chapter 2 presents the different genetic analyses performed on a large data set and discusses the use of microsatellites to describe the maize germplasm and to improve maize germplasm maintenance. Considerable attention is focused on how the maize germplasm is organized and genetic variation is distributed. A novel maximum likelihood method is developed to estimate the historical contributions for maize inbred lines. Chapter 3 covers a new method for optimal selection of a core set of lines from a large germplasm collection. The simulated annealing algorithm for choosing an optimal k-subset is described and evaluated using the maize germplasm as an example; general constraints are incorporated in the algorithm, and the efficiency of the algorithm is compared to existing methods. Chapter 4 covers a two-stage strategy to partition a chromosomal region into blocks with extensive within-block linkage disequilibrium, and to select the optimal subset of SNPs that essentially captures the haplotype variation within a block. Population simulations suggest that the recursive bisection algorithm for block partitioning is generally reliable for recombination hotspot identification. Maximal entropy theory is applied to choose the optimal subset of SNPs. The procedures are evaluated analytically as well as by simulation. The final chapter covers a new software package for genetic marker data analysis. The methods implemented in the package are listed, and a brief tutorial is included to illustrate the features of the package. Chapter 5 also describes a new method for estimating population-specific F-statistics and an extended algorithm for estimating haplotype frequencies.
30

Somasekaram, Premathas. "Designing a Business Intelligence Solution for Analyzing Security Data." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-208685.

Abstract:
Business Intelligence is a set of tools and applications that are widely deployed across major corporations today. An appropriate translation for "Business Intelligence" in Swedish is "beslutsstöd", and it clearly describes the purpose of such a solution: to collect, compress, consolidate and analyze data from multiple sources so that critical decisions can be made based on it, hence the name "beslutsstöd". The focus of Business Intelligence has been on business data, so that trends and patterns in sales, marketing, production and other business areas can be studied. In addition, based on the analysis, business processes such as production can be optimized, or financial data can be consolidated efficiently. These are only a few areas where Business Intelligence provides considerable support to decision-making. However, there is also a certain complexity associated with implementing a Business Intelligence solution. That means the implementation and operations costs can only be justified when critical business data is analyzed, which leaves out other important areas, such as security, that are usually not evaluated. Nevertheless, security should in fact be considered important for companies, organizations and all those that deal with research, development, and innovation, which are the keys for those entities to continue to exist and thrive. On the other hand, research, development, and innovation might be just the things that attract intrusion attempts and other malicious activities aimed at stealing valuable data; thus, it is equally important to secure sensitive data. The purpose of this study is to show how Business Intelligence can be used to analyze certain security data, so that it can then be used to detect and identify potential threats, intrusion attempts, weak points, and peculiar patterns, and to highlight security weak spots. This essentially means Business Intelligence can be an efficient tool to protect the invaluable intellectual property of a company. Furthermore, security analysis becomes even more important when considering the rapid development in the technological field; one good example of this is the introduction of so-called smart devices that are capable of handling a number of tasks automatically. Smart devices such as smart TVs or mobile phones offer a variety of new features and, in the process, use an increased number of hardware and software components that produce volumes of data. Consequently, all these may introduce new vulnerabilities, which in turn emphasizes the importance of using applications like Business Intelligence to identify security holes and potential threats, and to react proactively.
31

Le, Hai-Son Phuoc. "Probabilistic Models for Collecting, Analyzing, and Modeling Expression Data." Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/245.

Abstract:
Advances in genomics allow researchers to measure the complete set of transcripts in cells. These transcripts include messenger RNAs (which encode for proteins) and microRNAs, short RNAs that play an important regulatory role in cellular networks. While this data is a great resource for reconstructing the activity of networks in cells, it also presents several computational challenges. These challenges include the data collection stage, which often results in incomplete and noisy measurement, developing methods to integrate several experiments within and across species, and designing methods that can use this data to map the interactions and networks that are activated in specific conditions. Novel and efficient algorithms are required to successfully address these challenges. In this thesis, we present probabilistic models to address the set of challenges associated with expression data. First, we present a novel probabilistic error correction method for RNA-Seq reads. RNA-Seq generates large and comprehensive datasets that have revolutionized our ability to accurately recover the set of transcripts in cells. However, sequencing reads inevitably contain errors, which affect all downstream analyses. To address these problems, we develop an efficient hidden Markov model-based error correction method for RNA-Seq data. Second, for the analysis of expression data across species, we develop clustering and distance function learning methods for querying large expression databases. The methods use a Dirichlet Process Mixture Model with latent matchings and infer soft assignments between genes in two species to allow comparison and clustering across species. Third, we introduce new probabilistic models to integrate expression and interaction data in order to predict targets and networks regulated by microRNAs. Combined, the methods developed in this thesis provide a solution to the pipeline of expression analysis used by experimentalists when performing expression experiments.
32

Rodrigues, Livia Couto Ruback. "Enriching and Analyzing Semantic Trajectories with Linked Open Data." Pontifícia Universidade Católica do Rio de Janeiro, 2017. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=33109@1.

Abstract:
The last years witnessed a growing number of devices that track moving objects: personal GPS-equipped devices and GSM mobile phones, vehicles or other sensors from the Internet of Things, but also the location data deriving from social network check-ins. These mobility data are represented as trajectories, recording the sequence of locations of the moving object. However, these sequences only represent the raw location data, and they need to be semantically enriched to be meaningful in analysis tasks and to support a deep understanding of the movement behavior. Another unprecedented global space that is also growing at a fast pace is the Web of Data, thanks to the emergence of the Linked Data initiative. These freely available, semantically rich datasets provide a novel way to enhance trajectory data. This thesis presents a contribution to the many challenges that arise from this scenario. First, it investigates how trajectory data may benefit from the Linked Data initiative by guiding the whole trajectory enrichment process with the use of external datasets. Then, it addresses the pivotal topic of similarity computation between Linked Data entities, with the final objective of computing the similarity between semantically enriched trajectories. The novelty of our approach is that the thesis considers the relevant entity features as a ranked list. Finally, the thesis targets the computation of the similarity between enriched trajectories by comparing the similarity of the Linked Data entities that represent the enriched trajectories.
33

Suslov, E., O. Nozhenko, and A. Mostovych. "Strain gauge measurement data analyzing for flat wheel detection." Thesis, Національний авіаційний університет, 2017. http://er.nau.edu.ua/handle/NAU/32947.

34

Xi, Nuo. "A Composite Likelihood Approach for Factor Analyzing Ordinal Data." The Ohio State University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=osu1306331305.

35

Rylander, Max, and Filip Hultgren. "Application failure predictions from neural networks analyzing telemetry data." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-451340.

Abstract:
With the revolution of the internet, new applications have emerged in our daily life. People are dependent on services for transportation, bank matters, and communication, and service availability is crucial for providers' survival and competition against other service providers. Achieving good availability is a challenging task. The latest trend is migrating systems to the cloud. The cloud provides numerous methods to prevent downtime, such as auto-scaling, continuous deployment, continuous monitoring, and more. However, failures can still occur even though the preemptive techniques fulfill their purpose. Monitoring the system gives insights into the system's actual state, but it is up to the maintainer to interpret these insights. This thesis investigates how machine learning can predict future crashes of Kubernetes pods based on the metrics collected from them. At the start of the project, there was no available data on pod crashes, and the solution was to simulate a 10-tier microservice system in a Kubernetes cluster to create generic data. The project applies two different models, a Random Forest model and a Temporal Convolutional Networks model, where the first-mentioned acted as a baseline model. They predict whether a failure will occur within a given prediction time window based upon 15 minutes of data. The project evaluated three different prediction time windows. The five-minute prediction time window resulted in the best foresight based on the models' accuracy. The Random Forest model achieved an accuracy of 73.4 %, while the TCN model achieved an accuracy of 77.7 %. Predictions of the models can act as an early alert of incoming failure, which the system or a maintainer can act upon to improve the availability of the system.
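A toy version of the baseline setup described above: each 15-minute window of pod metrics is flattened into a feature vector and a random forest predicts whether a failure occurs within the horizon. The shapes, metric count, and labels below are fabricated placeholders, not the thesis data.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    X = rng.normal(size=(2000, 15 * 4))   # 15 one-minute samples x 4 assumed pod metrics
    y = rng.integers(0, 2, size=2000)     # 1 = crash within the prediction window

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    print(clf.score(X_te, y_te))          # held-out accuracy (near 0.5 on random labels)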
36

Antonelli, Joseph. "Statistical Methods for Analyzing Complex Spatial and Missing Data." Thesis, Harvard University, 2015. http://nrs.harvard.edu/urn-3:HUL.InstRepos:26718722.

Abstract:
In chapter 1, we develop a novel two-dimensional wavelet decomposition to decompose spatial surfaces into different frequencies without imposing any restrictions on the form of the spatial surface. We illustrate the effectiveness of the proposed decomposition on satellite based PM2.5 data, which is available on a 1km by 1km grid across Massachusetts. We then apply our proposed decomposition to study how different frequencies of the PM2.5 surface adversely impact birth weights in Massachusetts. In chapter 2, we study the impact of monitor locations on two stage health effect studies in air pollution epidemiology. Typically in these studies, estimates of air pollution exposure are obtained from a first stage model that utilizes monitoring data, and then a second stage outcome model is fit using this estimated exposure. The location of the monitoring sites is usually not random and their locations can drastically impact inference in health effect studies. We take an in-depth look at the specific case where the location of monitors depends on the locations of the subjects in the second stage model and show that inference can be greatly improved in this setting relative to completely random allocation of monitors. In chapter 3, we introduce a Bayesian data augmentation method to control for confounding in large administrative databases when additional data is available on confounders in a validation study. Large administrative databases are becoming increasingly available, and they have the power to address many questions that we otherwise couldn't answer. Most of these databases, while large in size, do not have sufficient information on confounders to validly estimate causal effects. However, in many cases a smaller, validation data set is available with a richer set of confounders. We propose a method that uses information from the validation data to impute missing confounders in the main data and select only those confounders which are necessary for confounding adjustment. We illustrate the effectiveness of our method in a simulation study, and analyze the effect of surgical resection on 30 day survival in brain tumor patients from Medicare.
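For intuition about decomposing a spatial surface into frequencies, here is a standard 2-D discrete wavelet transform with PyWavelets on a fabricated grid. The chapter develops its own two-dimensional decomposition, so this off-the-shelf transform is only an analogue, and the surface, wavelet, and level count are assumptions.

    import numpy as np
    import pywt

    rng = np.random.default_rng(5)
    surface = rng.normal(size=(128, 128))    # stand-in for a gridded PM2.5 surface

    coeffs = pywt.wavedec2(surface, wavelet="db2", level=3)
    approx, details = coeffs[0], coeffs[1:]  # coarse approximation + 3 detail levels
    print(approx.shape, [d[0].shape for d in details])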
37

Grigsby, Jason D. "Analyzing and improving initial data for binary black holes." Winston-Salem, NC : Wake Forest University, 2009. http://dspace.zsr.wfu.edu/jspui/handle/10339/44664.

38

Wilhelm, Gary L. "Analyzing and sharing data for surface combat weapons systems." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2004. http://library.nps.navy.mil/uhtbin/hyperion/04Dec%5FWilhelm.pdf.

39

Mathias, Henry. "Analyzing Small Businesses' Adoption of Big Data Security Analytics." ScholarWorks, 2019. https://scholarworks.waldenu.edu/dissertations/6614.

Abstract:
Despite the increased cost of data breaches due to advanced, persistent threats from malicious sources, the adoption of big data security analytics among U.S. small businesses has been slow. Anchored in a diffusion of innovation theory, the purpose of this correlational study was to examine ways to increase the adoption of big data security analytics among small businesses in the United States by examining the relationship between small business leaders' perceptions of big data security analytics and their adoption. The research questions were developed to determine how to increase the adoption of big data security analytics, which can be measured as a function of the user's perceived attributes of innovation represented by the independent variables: relative advantage, compatibility, complexity, observability, and trialability. The study included a cross-sectional survey distributed online to a convenience sample of 165 small businesses. Pearson correlations and multiple linear regression were used to statistically understand relationships between variables. There were no significant positive correlations between relative advantage, compatibility, and the dependent variable adoption; however, there were significant negative correlations between complexity, trialability, and the adoption. There was also a significant positive correlation between observability and the adoption. The implications for positive social change include an increase in knowledge, skill sets, and jobs for employees and increased confidentiality, integrity, and availability of systems and data for small businesses. Social benefits include improved decision making for small businesses and increased secure transactions between systems by detecting and eliminating advanced, persistent threats.
40

Harris, Lateasha Monique. "Perceptions of Teachers about Using and Analyzing Data to Inform Instruction." ScholarWorks, 2018. https://scholarworks.waldenu.edu/dissertations/5469.

Abstract:
Monitoring academic progress to guide instructional practices is an important role of teachers in a small rural school district in the Southern United States. Teachers in this region were experiencing difficulties using the approved school district model to implement data-driven instruction. The purpose of this qualitative case study was to identify elementary- and middle-level teachers' perceptions about using the Plan, Do, Study, Act (PDSA) model to analyze data in the classroom and use it to inform classroom instruction. Bambrick-Santoyo's principles for effective data-driven instruction was the conceptual framework that guided this study. The research questions were focused on teachers' perceptions of and experiences with the PDSA. A purposeful sampling was used to recruit 8 teachers from Grades 3-9 and their insights were captured through semistructured interviews, reflective journals, and document analyses of data walls. Emergent themes were identified through an open coding process, and trustworthiness was secured through triangulation and member checking. The themes were about using data to assess students, creating lessons, and collaborating with colleagues. The three findings revealed that elementary- and middle-level teachers acknowledge PDSA as an effective tool for guiding student learning, that teachers rely on assessment data, and that teachers need on-going collaborative engagement with their colleagues when using the PDSA. This study has implications for positive social change by providing a structure for improving classroom instructional practices and engaging teachers in more collaborative practices.
41

Sternelöv, Gustav. "Analysis of forklift data – A process for decimating data and analyzing fork positioning functions." Thesis, Linköpings universitet, Statistik och maskininlärning, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139213.

Abstract:
Investigated in this thesis are the possibilities and effects of reducing CAN data collected from forklifts. The purpose of reducing the data was to make it possible to export and manage data for multiple forklifts over a relatively long period of time. To do that, an autoregressive filter was implemented for filtering and decimating the data. Connected to the decimation was also the aim of generating a data set that could be used for analyzing lift sequences, and in particular the usage of fork adjustment functions during lift sequences. The findings in the report are that an AR(18) model works well for filtering and decimating the data. Information losses are unavoidable but kept at a relatively low level, and the size of the data becomes manageable. Each row in the decimated data is labeled as belonging or not belonging to a lift sequence, given a manually specified definition of the lift sequence event. From the lift sequences, information about the lift is gathered, such as the number of usages of each fork adjustment function, the load weight, and the fork height. The analysis of the lift sequences showed that the lift/lower function is used 4.75 times per lift sequence on average and the reach function 3.23 times on average. For the side shift the mean is 0.35 per lift sequence, and for the tilt the mean is 0.10. Moreover, the struggling time is on average about 17 % of the total lift sequence time. The proportion of the lift that is struggling time was also shown to differ between drivers, with the lowest mean proportion being 7 % and the highest 30 %.
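A rough sketch of the smooth-with-an-AR-model-then-downsample idea using statsmodels; the synthetic signal and the decimation factor of 10 are assumptions, and only the AR(18) order comes from the thesis.

    import numpy as np
    from statsmodels.tsa.ar_model import AutoReg

    rng = np.random.default_rng(3)
    x = np.sin(np.linspace(0, 20, 1000)) + 0.1 * rng.normal(size=1000)  # toy CAN-like signal

    res = AutoReg(x, lags=18).fit()   # AR(18), the order the thesis found to work well
    smoothed = res.fittedvalues       # one-step-ahead in-sample predictions
    decimated = smoothed[::10]        # keep every 10th sample (the factor is a guess)
    print(len(x), len(decimated))     # 1000 -> 99 (the first 18 samples have no prediction)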
42

Sismanis, Yannis. "Dwarf: a complete system for analyzing high-dimensional data sets." College Park, Md.: University of Maryland, 2004. http://hdl.handle.net/1903/1876.

Abstract:
Thesis (Ph. D.) -- University of Maryland, College Park, 2004.
Thesis research directed by: Computer Science. Title from t.p. of PDF. Includes bibliographical references. Published by UMI Dissertation Services, Ann Arbor, Mich. Also available in paper.
43

Furbush, Mary M. "Analyzing and reporting high school transcript and academic achievement data." Access to citation, abstract and download form provided by ProQuest Information and Learning Company; downloadable PDF file 6.87 Mb., 124 p, 2006. http://proquest.umi.com/pqdlink?did=1176542701&Fmt=7&clientId=79356&RQT=309&VName=PQD.

44

Kaida, Ning. "Biological insights of transcription factor through analyzing ChIP-Seq data." Thesis, University of British Columbia, 2009. http://hdl.handle.net/2429/21733.

Abstract:
ChIP-Seq is a technology for detecting in vivo transcription factor binding sites or histone modification sites on a genome-wide scale. How to utilize the large-scale data and find out biological insights is a challenging question for us. Here, we analyzed three ChIP-Seq data sets for the human HeLa cell, including data of a transcription factor called STAT1, data of RNA polymerase II (Pol2), and data of histone monomethylation (Me1). With these data sets, we looked into the spatial relationship between STAT1 binding sites, Pol2 binding sites, Me1 flanked regions and the gene transcription start sites; we checked the intersection of locations of STAT1 binding sites, Pol2 binding sites and Me1 flanked regions; we did de novo motif discovery for the sequences around the STAT1 binding sites, and predicted several transcription factors whose binding sites may form a cis-regulatory module with the STAT1 binding site; we put the STAT1-centered sequences into different categories based on their spatial relationship with Pol2 binding sites and Me1 flanked regions, and found that the de novo discovered motifs' occurrence rates are different in sequences of different categories; we also analyzed the ChIP-Seq data along with gene expression data, and found that STAT1 binding may be related to genes' differential expression under IFN-gamma stimulation. We suggest that further ChIP-Seq experiments be carried out for TFs corresponding to the de novo predicted motifs, and that gene expression be characterized for the IFN-gamma stimulated HeLa cell on the whole-genome scale.
45

Ning, Kaida. "Biological insights of transcription factor through analyzing ChIP-Seq data." Thesis, University of British Columbia, 2009. http://hdl.handle.net/2429/38531.

Abstract:
ChIP-Seq is a technology for detecting in vivo transcription factor binding sites or histone modification sites on a genome-wide scale. How to utilize the large-scale data and find out biological insights is a challenging question for us. Here, we analyzed three ChIP-Seq data sets for the human HeLa cell, including data of a transcription factor called STAT1, data of RNA polymerase II (Pol2), and data of histone monomethylation (Me1). With these data sets, we looked into the spatial relationship between STAT1 binding sites, Pol2 binding sites, Me1 flanked regions and the gene transcription start sites; we checked the intersection of locations of STAT1 binding sites, Pol2 binding sites and Me1 flanked regions; we did de novo motif discovery for the sequences around the STAT1 binding sites, and predicted several transcription factors whose binding sites may form a cis-regulatory module with the STAT1 binding site; we put the STAT1-centered sequences into different categories based on their spatial relationship with Pol2 binding sites and Me1 flanked regions, and found that the de novo discovered motifs' occurrence rates are different in sequences of different categories; we also analyzed the ChIP-Seq data along with gene expression data, and found that STAT1 binding may be related to genes' differential expression under IFN-gamma stimulation. We suggest that further ChIP-Seq experiments be carried out for TFs corresponding to the de novo predicted motifs, and that gene expression be characterized for the IFN-gamma stimulated HeLa cell on the whole-genome scale.
46

Rader, Kevin Andrew. "Methods for Analyzing Survival and Binary Data in Complex Surveys." Thesis, Harvard University, 2014. http://dissertations.umi.com/gsas.harvard:11619.

Abstract:
Studies with stratified cluster designs, called complex surveys, have increased in popularity in medical research recently. With the passing of the Affordable Care Act, more information about effectiveness of treatment, cost of treatment, and patient satisfaction may be gleaned from these large complex surveys. We introduce three separate methodological approaches that are useful in complex surveys.
47

Yuting, Feng. "Analyzing European National Accounts Data for Detection of anomalous observation." Thesis, Örebro universitet, Handelshögskolan vid Örebro Universitet, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-35667.

48

Wang, Suyi. "Analyzing data with 1D non-linear shapes using topological methods." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1524020976023345.

49

Hoshaw-Woodard, Stacy. "Large sample methods for analyzing longitudinal data in rehabilitation research /." free to MU campus, to others for purchase, 1999. http://wwwlib.umi.com/cr/mo/fullcit?p9946263.

50

Pungdumri, Steven Charubhat. "An Interactive Visualization Model for Analyzing Data Storage System Workloads." DigitalCommons@CalPoly, 2012. https://digitalcommons.calpoly.edu/theses/705.

Abstract:
The performance of hard disks has become increasingly important as the volume of data storage increases. At the bottom level of large-scale storage networks is the hard disk. Despite the importance of hard drives in a storage network, it is often difficult to analyze the performance of hard disks due to the sheer size of the datasets seen by hard disks. Additionally, hard drive workloads can have several multi-dimensional characteristics, such as access time, queue depth and block-address space. The result is that hard drive workloads are extremely diverse and large, making extracting meaningful information from hard drive workloads very difficult. This is one reason why there are several inefficiencies in storage networks. In this paper, we develop a tool that assists in communicating valuable insights into these datasets, resulting in an approach that utilizes parallel coordinates to model data storage workloads captured with bus analyzers. Users are presented with an effective visualization of workload captures with this implementation, along with methods to interact with and manipulate the model in order to more clearly analyze the lowest level of their storage systems. Design decisions regarding the feature set of this tool are based on the analysis needs of domain experts and feedback from a conducted user study. Results from our user study evaluations demonstrate the efficacy of our tool for observing valuable insights, which can potentially assist in future storage system design and deployment decisions. Using this tool, domain experts were able to model storage system datasets with various features, manipulating the visualization to make observations and discoveries, such as detecting logical block address banding and observing various dataset trends that were not readily noticeable using conventional analysis methods.
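For a flavor of the underlying display metaphor, the sketch below draws a static parallel-coordinates plot over fabricated workload records with pandas and matplotlib; the column names and distributions are invented, and the thesis tool is interactive and far richer. Min-max scaling each axis first keeps the wildly different ranges comparable.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from pandas.plotting import parallel_coordinates

    rng = np.random.default_rng(4)
    df = pd.DataFrame({
        "lba": rng.integers(0, 2**28, 300),        # logical block address
        "length": rng.integers(8, 1024, 300),      # transfer size in blocks
        "queue_depth": rng.integers(1, 32, 300),
        "latency_ms": rng.gamma(2.0, 2.0, 300),
        "op": rng.choice(["read", "write"], 300),  # class column used for coloring
    })

    num = df.drop(columns="op")
    scaled = (num - num.min()) / (num.max() - num.min())  # min-max scale each axis
    scaled["op"] = df["op"]
    parallel_coordinates(scaled, class_column="op", alpha=0.3)
    plt.show()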