Dissertations / Theses on the topic 'Selection analysis'

To see the other types of publications on this topic, follow the link: Selection analysis.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Selection analysis.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations / theses from a wide variety of disciplines and organise your bibliography correctly.

1

Birch, Jonathan George. "Kin selection : a philosophical analysis." Thesis, University of Cambridge, 2013. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.648149.

Full text
2

Mirchandani, Chandru, and Parminder Ghuman. "PERFORMANCE ANALYSIS FOR SYSTEM SELECTION." International Foundation for Telemetering, 2001. http://hdl.handle.net/10150/607568.

Full text
Abstract:
International Telemetering Conference Proceedings / October 22-25, 2001 / Riviera Hotel and Convention Center, Las Vegas, Nevada
The development of unique solutions to telemetry processing using the latest technologies is often fraught with uncertainty about whether the system will work correctly within the schedule for operational support. This uncertainty can be reduced considerably by analyzing the performance of the system during the development and incremental test stage. This paper describes a method by which the analysis may be carried out during development so that the system will have the required capability in time for mission support. The paper shows how different system models lend themselves to the requirements, and how the analyses identify areas of high risk. It also describes a case study in which three alternative approaches to telemetry processing have been used and could have been analyzed so that they would have met the requirements in a timely manner.
3

Hiles, James F. "Multi-phase source selection strategy analysis." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2000. http://handle.dtic.mil/100.2/ADA386722.

Full text
Abstract:
Thesis (M.S. in Management)--Naval Postgraduate School, Dec. 2000.
"December 2000." Thesis advisor(s): Jeffrey Cuskey, Keith Snider. Includes bibliographical references (p. 111-114). Also available online.
4

Hare, Brian K. "Feature selection in DNA microarray analysis." Diss., UMK access, 2004.

Find full text
Abstract:
Thesis (M.S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2004.
"A thesis in computer science." Typescript. Advisor: D. Dinakarpandian. Vita. Title from "catalog record" of the print edition Description based on contents viewed Feb. 24, 2006. Includes bibliographical references (leaves 81-86 ). Online version of the print edition.
5

Dimitrakopoulou, Vasiliki. "Bayesian variable selection in cluster analysis." Thesis, University of Kent, 2012. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.594195.

Full text
Abstract:
Statistical analysis of data sets of high dimensionality has met great interest over the past years, with applications in disciplines such as medicine, neuroscience, pattern recognition, image analysis and many others. The vast number of available variables, in contrast to the limited sample size, often masks the cluster structure of the data. It is often the case that some variables do not help in distinguishing the different clusters in the data; patterns over the sampled observations are thus usually confined to a small subset of variables. We are therefore interested in identifying the variables that best discriminate the sample, while simultaneously recovering the actual cluster structure of the objects under study. With the Markov Chain Monte Carlo methodology being widely established, we investigate the performance of the combined tasks of variable selection and clustering within the Bayesian framework. Motivated by the work of Tadesse et al. (2005), we identify the set of discriminating variables with the use of a latent vector and formulate the clustering procedure within the finite mixture models methodology. Using Markov chains we draw inference not just on the set of selected variables and the cluster allocations, but also on the actual number of components, using the Reversible Jump MCMC sampler (Green, 1995) and a variation of the SAMS sampler of Dahl (2005). However, sensitivity to the hyperparameter settings of the covariance structure of the suggested model motivated our interest in an Empirical Bayes procedure to pre-specify the crucial hyperparameters. Further addressing the problem of hyperparameter sensitivity, we suggest several different covariance structures for the mixture components. Developing MATLAB code for all models introduced in this thesis, we apply and compare the various models suggested on a set of simulated data, as well as on three real data sets: the iris, the crabs and the arthritis data sets.
6

Bastola, Jatan, Kenneth E. Findley, and Nathan T. Woodward. "Analysis of contract source selection strategy." Thesis, Monterey, California: Naval Postgraduate School, 2015. http://hdl.handle.net/10945/45810.

Full text
Abstract:
Approved for public release; distribution is unlimited
The Department of Defense (DOD) spends billions acquiring weapons systems, supplies, and services. The contract management process has to be executed diligently to ensure the government is receiving the highest return on investment. The process has six steps, two of which relate to the source selection strategy: solicitation planning and source selection. Once the acquisition team determines whether to use a lowest price technically acceptable (LPTA) or Tradeoff source selection strategy, they evaluate proposals to determine which offer presents the best value to the government. The purpose of this research is to explore potential relationships between the source selection strategy (LPTA or Tradeoff) and resultant contract outcomes. This research uses data collected from contract files and related documentation from two major systems commands (Naval Air Systems Command and Naval Sea Systems Command) to show the implication of the LPTA and Tradeoff source selection strategies. The findings suggest that an LPTA source selection strategy has a significantly shorter lead-time to contract award. The findings should be viewed with caution, however, as the sample size consisted of only six LPTA contracts. This report concludes with two recommendations to improve further research on choosing a source selection strategy and contract outcomes.
7

Ni, Xuelei. "New results in detection, estimation, and model selection." Available online, Georgia Institute of Technology, 2006. http://etd.gatech.edu/theses/available/etd-12042005-190654/.

Full text
Abstract:
Thesis (Ph. D.)--Industrial and Systems Engineering, Georgia Institute of Technology, 2006.
Xiaoming Huo, Committee Chair ; C. F. Jeff Wu, Committee Member ; Brani Vidakovic, Committee Member ; Liang Peng, Committee Member ; Ming Yuan, Committee Member.
8

Garrad, Mark. "Computer Aided Text Analysis in Personnel Selection." Griffith University. School of Applied Psychology, 2004. http://www4.gu.edu.au:8080/adt-root/public/adt-QGU20040408.093133.

Full text
Abstract:
This program of research was aimed at investigating a novel application of computer aided text analysis (CATA). To date, CATA has been used in a wide variety of disciplines, including Psychology, but never in the area of personnel selection. Traditional personnel selection techniques have met with limited success in the prediction of costly training failures for some occupational groups such as pilot and air traffic controller. Accordingly, the overall purpose of this thesis was to assess the validity of linguistic style to select personnel. Several studies were used to examine the structure of language in a personnel selection setting; the relationship between linguistic style and the individual differences dimensions of ability, personality and vocational interests; the validity of linguistic style as a personnel selection tool and the differences in linguistic style across occupational groups. The participants for the studies contained in this thesis consisted of a group of 810 Royal Australian Air Force Pilot, Air Traffic Control and Air Defence Officer trainees. The results partially supported two of the eight hypotheses; the other six hypotheses were supported. The structure of the linguistic style measure was found to be different in this study compared with the structure found in previous research. Linguistic style was found to be unrelated to ability or vocational interests, although some overlap was found between linguistic style and the measure of personality. In terms of personnel selection validity, linguistic style was found to relate to the outcome of training for the occupations of Pilot, Air Traffic Control and Air Defence Officer. Linguistic style also demonstrated incremental validity beyond traditional ability and selection interview measures. The findings are discussed in light of the Five Factor Theory of Personality, and motivational theory and a modified spreading activation network model of semantic memory and knowledge. A general conclusion is drawn that the analysis of linguistic style is a promising new tool in the area of personnel selection.
9

Stark, J. Alex. "Statistical model selection techniques for data analysis." Thesis, University of Cambridge, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.390190.

Full text
10

ALBUQUERQUE, LUIZ FERNANDO FERNANDES DE. "ONLINE ALGORITHMS ANALYSIS FOR SPONSORED LINKS SELECTION." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2009. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=16088@1.

Full text
Abstract:
Sponsored links are those that appear highlighted at Internet search engine results. They are responsible for a large amount of their providers’ revenue. To advertisers, that place bids for keywords in large auctions at Internet, these links are the opportunity of brand exposing and achieving more clients. To search engine companies, one of the main challenges in this business model is selecting which advertisers should be allocated to each new query to maximize their total revenue in the end of the day. This is a typical online problem, where for each query is taken a decision without previous knowledge of future queries. Once the decision is taken, it can not be modified anymore. In this work, using synthetically generated data, we do experimental evaluation of three algorithms proposed in the literature for this problem and compare their results with the optimal offline solution. Considering that daily query set obeys some well known distribution, we propose two algorithms based on stochastic information, those are evaluated in the same scenarios of the others.
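To make the online-allocation setting concrete, here is a minimal greedy sketch. It is not one of the algorithms evaluated in the dissertation: it simply assigns each arriving query to the advertiser with the highest bid scaled by remaining budget, a common baseline for online budgeted allocation. All bids, budgets, and the length of the query stream are invented for illustration.

```python
# Illustrative sketch only: a greedy online allocator for sponsored links.
advertisers = {            # advertiser -> (bid per query, daily budget)
    "A": (0.50, 10.0),
    "B": (0.80, 4.0),
    "C": (0.30, 20.0),
}
budget_left = {name: budget for name, (_, budget) in advertisers.items()}

def select():
    """Greedy rule: among advertisers that can still pay their bid, pick the
    one with the highest bid scaled by the fraction of budget remaining."""
    best, best_score = None, 0.0
    for name, (bid, budget) in advertisers.items():
        if budget_left[name] >= bid:
            score = bid * (budget_left[name] / budget)
            if score > best_score:
                best, best_score = name, score
    return best

revenue = 0.0
for _ in range(60):        # each query is decided online, with no lookahead
    winner = select()
    if winner is not None:
        bid = advertisers[winner][0]
        budget_left[winner] -= bid
        revenue += bid
print(f"greedy revenue: {revenue:.2f}")
```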
11

Li, Lingbo. "Exact analysis for requirements selection and optimisation." Thesis, University College London (University of London), 2017. http://discovery.ucl.ac.uk/1575500/.

Full text
Abstract:
Requirements engineering is the prerequisite of software engineering, and plays a critically strategic role in the success of software development. Insufficient management of uncertainty in the requirements engineering process has been recognised as a key reason for software project failure. The essence of uncertainty may arise from partially observable, stochastic environments, or ignorance. To ease the impact of uncertainty in the software development process, it is important to provide techniques that explicitly manage uncertainty in requirements selection and optimisation. This thesis presents a decision support framework to exactly address the uncertainty in requirements selection and optimisation. Three types of uncertainty are managed. They are requirements uncertainty, algorithmic uncertainty, and uncertainty of resource constraints. Firstly, a probabilistic robust optimisation model is introduced to enable the manageability of requirements uncertainty. Requirements uncertainty is probabilistically simulated by Monte-Carlo Simulation and then formulated as one of the optimisation objectives. Secondly, a probabilistic uncertainty analysis and a quantitative analysis sub-framework METRO is designed to cater for requirements selection decision support under uncertainty. An exact Non-dominated Sorting Conflict Graph based Dynamic Programming algorithm lies at the heart of METRO to guarantee the elimination of algorithmic uncertainty and the discovery of guaranteed optimal solutions. Consequently, any information loss due to algorithmic uncertainty can be completely avoided. Moreover, a data analytic approach is integrated in METRO to help the decision maker to understand the remaining requirements uncertainty propagation throughout the requirements selection process, and to interpret the analysis results. Finally, a more generic exact multi-objective integrated release and schedule planning approach iRASPA is introduced to holistically manage the uncertainty of resource constraints for requirements selection and optimisation. Software release and schedule plans are integrated into a single activity and solved simultaneously. Accordingly, a more advanced globally optimal result can be produced by accommodating and managing the inherent additional uncertainty due to resource constraints as well as that due to requirements. To settle the algorithmic uncertainty problem and guarantee the exactness of results, an ε-constraint Quadratic Programming approach is used in iRASPA.
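As a toy illustration of what an exact method buys here, the sketch below solves a single-constraint simplification of requirements selection by 0/1-knapsack dynamic programming, which guarantees an optimal subset with no algorithmic uncertainty. The requirement names, costs, and values are invented; the thesis's METRO and iRASPA approaches handle far richer multi-objective formulations.

```python
# Toy sketch: exact requirements selection under a single cost budget via
# 0/1-knapsack dynamic programming (invented data, integer costs).
def select_requirements(reqs, budget):
    best = [0.0] * (budget + 1)              # best[c] = max value at cost <= c
    choice = [[] for _ in range(budget + 1)]
    for name, cost, value in reqs:
        for c in range(budget, cost - 1, -1):  # descending: each req used once
            if best[c - cost] + value > best[c]:
                best[c] = best[c - cost] + value
                choice[c] = choice[c - cost] + [name]
    return best[budget], choice[budget]

reqs = [("login", 3, 7.0), ("search", 5, 9.0), ("export", 2, 3.0), ("sso", 4, 6.0)]
value, chosen = select_requirements(reqs, budget=9)
print(value, chosen)                         # guaranteed-optimal subset
```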
12

Amaya, Kenichi. "Dynamic analysis of equilibrium selection in games." Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/17622.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Economics, 2003.
Includes bibliographical references.
Chapter 1 analyzes how pre-play communication and evolution together do or do not lead to socially efficient equilibria in 2 x 2 symmetric coordination games. In our evolutionary dynamics, there are committed players who can choose only one particular action of the base game, as well as players who can choose message-contingent actions, and the evolution in the choice of message is faster than the evolution in actions. We show the Pareto efficient equilibria are selected if and only if the base game satisfies the self-signalling condition, which means that a player has an incentive to convince the opponent that he is going to play the Pareto efficient equilibrium strategy if and only if he is actually planning to play that strategy. Chapter 2 analyzes a stochastic evolutionary dynamics of Kandori-Mailath-Rob (1993) in Spence's job-market signaling model. In contrast to Nöldeke and Samuelson's (1997) analysis, which showed the Riley equilibrium is selected only if it is undefeated, we show that the Riley equilibrium is always selected. The key to this difference is how mutations affect players' behavior. While Nöldeke and Samuelson allow a single mutation to change players' actions drastically, we consider a model where players change behavior only slightly if the number of mutations is small. Chapter 3 analyzes pure strategy Markov perfect equilibria in two-player asynchronous choice repeated games where the stage game is a 2 x 2 game. We show that Markov perfect equilibrium leads players to behave differently from the static Nash equilibrium in some environments, while in other environments it gives equilibrium selection results.
by Kenichi Amaya.
Ph.D.
13

Smaling, Rudolf M. "System architecture analysis and selection under uncertainty." Thesis, Massachusetts Institute of Technology, 2005. http://hdl.handle.net/1721.1/28943.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Engineering Systems Division, 2005.
Includes bibliographical references (leaves 183-191).
A system architecture analysis and selection methodology is presented that builds on the Multidisciplinary Analysis and Optimization framework. It addresses a need and opportunity to extend the MAO techniques to include a means to analyze not only within the technical domain, but also the external influences that will act on the system once it is in operation. The nature and extent of these external influences are uncertain, increasingly so for systems with long development timelines, and methods for addressing such uncertainty are central to the thesis. The research presented in this document has culminated in a coherent system architecture analysis and selection process addressing this need that consists of several steps: 1. The introduction of the concept of Fuzzy Pareto Optimality. Under uncertainty, one must necessarily consider more than just Pareto Optimal solutions to avoid the unintentional exclusion of viable and possibly even desirable designs. 2. The introduction of a proximity-based filtering technique that explicitly links the design and solution spaces. The intent here is to preserve diverse designs, even if their resulting performance is similar. 3. The introduction of the concept of Technology Invasiveness through the use of a component Delta Design Structure Matrix (ΔDSM). The component DSM is used to evaluate the changes in the DSM due to the technology insertion. Based on the quantity and type of these changes a Technology Invasiveness metric is computed. 4. Through the use of utility curves, the technical domain analysis is linked to an analysis of external influence factors. The shape of these curves depends wholly on the external influences that may act on the system once it is commercialized or otherwise put into use. The utility curves, in combination with the (technical) performance distributions, are then used to compute risk and opportunity for each system architecture. System architecture selection follows from analysis in the technical domain linked to an analysis of external influences and their impact on system architecture potential for success. All of the concepts and the integrated process are developed and assessed in the context of a case study of a Hydrogen Enhanced Combustion Engine being considered for possible insertion into the vehicle fleet.
by Rudolf M. Smaling.
Ph.D.
14

Beniwal, Baldev K. "Genetic analysis of long-term selection experiments." Thesis, University of Edinburgh, 1991. http://hdl.handle.net/1842/13973.

Full text
Abstract:
A long-term selection experiment with mice spanning 38 generations was analysed. Genetic parameters were estimated for lean mass, body weight, litter size and other associated traits using univariate and multivariate REML analyses with an animal model fitting litters as an additional random effect. Different combinations of selected and control lines were analysed. The change in genetic parameters during the course of selection and the infinitesimal model assumptions were examined. Three replicates, each having high and low selected lines for lean mass at 10 weeks of age, were maintained for 20 generations with unselected controls (P-Lines). This resulted in a divergence of 7 phenotypic standard deviation units (σP) between the high and low lean mass lines. After 20 generations the replicates were crossed and the selection criterion was changed to 10 week body weight (P6-Lines) without maintaining the controls. At generation 38 the selected lines diverged by 23.2g (8.7 σP) for 10 week body weight. The estimates of heritability (h2) and c2 (common litter variance/phenotypic variance ratio) for lean mass were 0.5 and 0.2 from the univariate REML analyses of both the control lines alone and the pooled data of control + high + low lines. Estimates of c2 were higher in the high lines and lower in the low lines, controls being intermediate. The estimates of genetic parameters for body weight were similar to those for lean mass in the P-lines. Analysis of the selected lines indicated a steady decline in their additive genetic variance (Va) during the course of selection for both lean mass and body weight, even though allowance was made in the infinitesimal model for reductions in Va due to inbreeding and linkage disequilibrium. The multivariate REML estimates of h2 and c2 for lean mass and body weight were similar to those from the univariate analyses. The genetic (rg) and phenotypic correlations (rp) between lean mass and body weight were very high (> 0.9) and positive. Lean mass also showed positive correlations (rg and rp) with gonadal fat pad weight, but when gonadal fat pad weight was expressed as a proportion of body weight it showed small negative correlations with lean mass.
15

Yin, Peng. "Local sensitivity analysis and bias model selection." Thesis, University of Newcastle upon Tyne, 2014. http://hdl.handle.net/10443/2385.

Full text
Abstract:
Incomplete data analysis is often considered alongside other problems such as model uncertainty or non-identifiability. In this thesis I will use the idea of local sensitivity analysis to address problems under both ignorable and non-ignorable missing data assumptions. One problem with ignorable missing data is the uncertainty of the covariate density. At the same time, misspecification of the missing data mechanism may occur as well. Incomplete data biases are then caused by different sources, and we aim to evaluate these biases and interpret them via bias parameters. Under non-ignorable missing data, the bias analysis can also be applied to analyse the difference from ignorability, and the misspecification of the missing data mechanism will be our primary interest in this case. Monte Carlo sensitivity analysis is proposed and developed to perform bias model selection. This method combines the ideas of conventional sensitivity analysis and Bayesian sensitivity analysis, with an imputation procedure and the bootstrap method used to simulate the incomplete dataset. The selection of bias models is based on a measure between the observed dataset and the simulated incomplete dataset, using the K nearest neighbour distance. We further discuss the non-ignorable missing data problem under a selection model, with our developed sensitivity analysis method used to identify the bias parameters in the missing data mechanism. Finally, we discuss robust confidence intervals in meta-regression models with publication bias and missing confounders.
16

Vaughan, Carol E. "A cluster analysis method for materials selection." Thesis, Virginia Tech, 1992. http://hdl.handle.net/10919/41497.

Full text
Abstract:
Materials have typically been selected based on the familiarities and past experiences of a limited number of designers with a limited number of materials. Problems arise when the designer is unfamiliar with new or improved materials, or production processes more efficient and economical than past choices. Proper utilization of complete materials and processing information would require acquisition, understanding, and manipulation of huge amounts of data, including dependencies among variables and "what-if" situations. The problem of materials selection has been addressed with a variety of techniques, from simple broad-based heuristics as guidelines for selection, to elaborate expert system technologies for specific selection situations. However, most materials selection methodologies concentrate only on material properties, leaving other decision criteria with secondary importance. Factors such as component service environment, design features, and feasible manufacturing methods directly influence the material choice, but are seldom addressed in systematic materials selection procedures. This research addresses the problem of developing a systematic materials selection procedure that can be integrated with standard materials data bases. The three-phase methodology developed utilizes a group technology code and cluster analysis method for the selection. The first phase is of go/no go nature, and utilizes the possible service environment requirements of ferromagnetism and chemical corrosion resistance to eliminate materials from candidacy. In the second phase, a cluster analysis is performed on key design and manufacturing attributes captured in a group technology code for remaining materials. The final phase of the methodology is user-driven, in which further analysis of the output of the cluster analysis can be performed for more specific or subjective attributes.
Master of Science
17

Malone, Gwendolyn Joy. "Ranking and Selection Procedures for Bernoulli and Multinomial Data." Diss., Georgia Institute of Technology, 2004. http://hdl.handle.net/1853/7603.

Full text
Abstract:
Ranking and Selection procedures have been designed to select the best system from a number of alternatives, where the best system is defined by the given problem. The primary focus of this thesis is on experiments where the data are from simulated systems. In simulation ranking and selection procedures, four classes of comparison problems are typically encountered. We focus on two of them: Bernoulli and multinomial selection. Therefore, we wish to select the best system from a number of simulated alternatives, where the best system is defined as either the one with the largest probability of success (Bernoulli selection) or the one with the greatest probability of being the best performer (multinomial selection). We focus on procedures that are sequential and use an indifference-zone formulation wherein the user specifies the smallest practical difference he wishes to detect between the best system and other contenders. We apply fully sequential procedures due to Kim and Nelson (2004) to Bernoulli data for terminating simulations, employing common random numbers. We find that significant savings in total observations can be realized for two to five systems when we wish to detect small differences between competing systems. We also study the multinomial selection problem. We offer a Monte Carlo simulation of the Bechhofer and Kulkarni (1984) MBK multinomial procedure and provide extended tables of results. In addition, we introduce a multi-factor extension of the MBK procedure. This procedure allows multiple independent factors of interest to be tested simultaneously from one data source (e.g., one person will answer multiple independent surveys) with significant savings in total observations compared to the factors being tested in independent experiments (each survey is run with separate focus groups and results are combined after the experiment). Another multi-factor multinomial procedure is also introduced, which is an extension of the MBG procedure due to Bechhofer and Goldsman (1985, 1986). This procedure performs better than any other procedure to date for the multi-factor multinomial selection problem and should always be used whenever table values for the truncation point are available.
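The single-stage flavour of multinomial selection is easy to demonstrate with a Monte Carlo sketch. This is a deliberate simplification: the MBK and MBG procedures studied in the thesis are sequential, with elimination and truncation. Here we only estimate the probability of correct selection, i.e., that the truly most probable category collects the most of n trials; the cell probabilities are invented.

```python
# Monte Carlo sketch of single-stage multinomial selection (illustrative only).
import random

def prob_correct_selection(p, n, reps=20_000, seed=7):
    rng = random.Random(seed)
    k = len(p)
    best = max(range(k), key=lambda i: p[i])   # the truly best category
    wins = 0
    for _ in range(reps):
        counts = [0] * k
        for _ in range(n):                     # one multinomial sample of size n
            u, acc = rng.random(), 0.0
            for i, pi in enumerate(p):
                acc += pi
                if u < acc:
                    counts[i] += 1
                    break
        m = max(counts)                        # select the modal cell,
        leaders = [i for i, c in enumerate(counts) if c == m]
        wins += rng.choice(leaders) == best    # breaking ties at random
    return wins / reps

# indifference-zone flavour: the best cell is 1.4x as likely as the runner-up
print(prob_correct_selection([0.35, 0.25, 0.20, 0.20], n=40))
```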
18

Youn, Eun Seog. "Feature selection and discriminant analysis in data mining." [Gainesville, Fla.] : University of Florida, 2004. http://purl.fcla.edu/fcla/etd/UFE0004221.

Full text
19

Li, Hao. "Feature cluster selection for high-dimensional data analysis." Diss., Online access via UMI, 2007.

Find full text
20

Podder, Mohua. "Robust genotype classification using dynamic variable selection." Thesis, University of British Columbia, 2008. http://hdl.handle.net/2429/1602.

Full text
Abstract:
Single nucleotide polymorphisms (SNPs) are DNA sequence variations, occurring when a single nucleotide – A, T, C or G – is altered. Arguably, SNPs account for more than 90% of human genetic variation. Dr. Tebbutt's laboratory has developed a highly redundant SNP genotyping assay consisting of multiple probes with signals from multiple channels for a single SNP, based on arrayed primer extension (APEX). The strength of this platform is its unique redundancy, having multiple probes for a single SNP. Using this microarray platform, we have developed fully-automated genotype calling algorithms based on linear models for individual probe signals and using dynamic variable selection at the prediction level. The algorithms combine separate analyses based on the multiple probe sets to give a final confidence score for each candidate genotype. Our proposed classification model achieved an accuracy level of >99.4% with 100% call rate for the SNP genotype data, which is comparable with existing genotyping technologies. We discuss the appropriateness of the proposed model relative to other existing high-throughput genotype calling algorithms. In this thesis we have explored three new ideas for classification with high dimensional data: (1) ensembles of various sets of predictors with a built-in dynamic property; (2) robust classification at the prediction level; and (3) a proper confidence measure for dealing with failed predictor(s). We found that a mixture model for classification provides robustness against outlying values of the explanatory variables. Furthermore, the algorithm chooses among different sets of explanatory variables in a dynamic way, prediction by prediction. We analyzed several data sets, including real and simulated samples, to illustrate these features. Our model-based genotype calling algorithm captures the redundancy in the system, considering all the underlying probe features of a particular SNP and automatically down-weighting any 'bad data' corresponding to image artifacts on the microarray slide or failure of a specific chemistry. Though motivated by this genotyping application, the proposed methodology would apply to other classification problems where the explanatory variables fall naturally into groups or outliers in the explanatory variables require variable selection at the prediction stage for robustness.
21

Lee, Wai Hong. "Variable selection for high dimensional transformation model." HKBU Institutional Repository, 2010. http://repository.hkbu.edu.hk/etd_ra/1161.

Full text
22

Lipkovich, Ilya A. "Bayesian Model Averaging and Variable Selection in Multivariate Ecological Models." Diss., Virginia Tech, 2002. http://hdl.handle.net/10919/11045.

Full text
Abstract:
Bayesian Model Averaging (BMA) is a new area in modern applied statistics that provides data analysts with an efficient tool for discovering promising models and obtaining estimates of their posterior probabilities via Markov chain Monte Carlo (MCMC). These probabilities can be further used as weights for model-averaged predictions and estimates of the parameters of interest. As a result, variance components due to model selection are estimated and accounted for, contrary to the practice of conventional data analysis (such as, for example, stepwise model selection). In addition, variable activation probabilities can be obtained for each variable of interest. This dissertation is aimed at connecting BMA and various ramifications of the multivariate technique called Reduced-Rank Regression (RRR). In particular, we are concerned with Canonical Correspondence Analysis (CCA) in ecological applications where the data are represented by a site-by-species abundance matrix with site-specific covariates. Our goal is to incorporate the multivariate techniques, such as Redundancy Analysis and Canonical Correspondence Analysis, into the general machinery of BMA, taking into account such complicating phenomena as outliers and clustering of observations within a single data-analysis strategy. Traditional implementations of model averaging are concerned with selection of variables. We extend the methodology of BMA to selection of subgroups of observations and implement several approaches to cluster and outlier analysis in the context of the multivariate regression model. The proposed algorithm of cluster analysis can accommodate restrictions on the resulting partition of observations when some of them form sub-clusters that have to be preserved when larger clusters are formed.
Ph. D.
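A minimal sketch of the model-averaging idea follows, using BIC weights as a standard approximation to posterior model probabilities in place of the dissertation's MCMC machinery; the data and the candidate linear models are simulated for illustration.

```python
# BMA sketch: BIC-approximated posterior model weights over all subsets of
# three predictors, plus a variable "activation" probability (illustrative).
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 80
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n)   # x2 is inactive

def bic(cols):
    Xs = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    rss = float(np.sum((y - Xs @ beta) ** 2))
    return n * np.log(rss / n) + Xs.shape[1] * np.log(n)

models = [c for r in range(4) for c in itertools.combinations(range(3), r)]
bics = np.array([bic(list(c)) for c in models])
w = np.exp(-0.5 * (bics - bics.min()))
w /= w.sum()                        # approximate P(model | data)

for c, wi in zip(models, w):
    if wi > 0.01:
        print(c, round(float(wi), 3))
print("P(x2 active) ~", round(float(sum(wi for c, wi in zip(models, w) if 2 in c)), 3))
```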
23

Wirz, Monica. "The practices within leadership selection : a gender analysis." Thesis, University of Cambridge, 2015. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.709337.

Full text
24

Al-Kandari, Noriah Mohammed. "Variable selection and interpretation in principal component analysis." Thesis, University of Aberdeen, 1998. http://digitool.abdn.ac.uk/R?func=search-advanced-go&find_code1=WSN&request1=AAIU067766.

Full text
Abstract:
In many research fields such as medicine, psychology, management and zoology, large numbers of variables are sometimes measured on each individual. As a result, the researcher will end up with a huge data set consisting of a large number of variables, say p. Using this collected data set in any statistical analysis may cause several problems. Thus, many cases demand a prior selection of the best subset of variables of size q, with q « p, to represent the entire data set in any data analysis. Evidently, the best subset of size q for some specified objective can always be determined by investigating systematically all possible subsets of size q, but such a procedure may be computationally difficult, especially for large p. Also, in many applications, when a Principal Component Analysis (PCA) is done on a large number of variables, the resultant Principal Components (PCs) may not be easy to interpret. To aid interpretation, it is useful to reduce the number of variables as much as possible whilst capturing most of the variation of the complete data set, X. Thus, this thesis aims to reduce the studied number of variables in a given data set by selecting the best q out of p measured variables to highlight the main features of a structured data set, as well as aiding the simultaneous interpretation of the first k (covariance or correlation) PCs. This aim is achieved by generating several artificial data sets having different types of structures, such as nearly independent variables, highly dependent variables and clustered variables. Then, for each structure, several Variable Selection Criteria (VSC) are applied in order to retain some subsets of size q. The efficiencies of the subsets retained are measured in order to determine the best criteria for retaining subsets of size q. Finally, the general results obtained from the entire artificial data analyses are evaluated on some real data sets having interesting covariance and correlation structures.
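One simple criterion of the kind compared in such studies can be sketched directly. This is an illustrative rule only (not necessarily one of the VSC examined in the thesis): retain, for each of the first k principal components, the variable with the largest absolute loading.

```python
# Illustrative PCA-based variable selection: for each of the first k
# components of the correlation matrix, keep the highest-loading variable.
import numpy as np

rng = np.random.default_rng(2)
Z = rng.normal(size=(200, 3))
X = np.column_stack([Z[:, 0],
                     Z[:, 0] + 0.1 * rng.normal(size=200),  # near-duplicate
                     Z[:, 1],
                     Z[:, 2],
                     Z[:, 1] - Z[:, 2]])                    # a mixture

R = np.corrcoef(X, rowvar=False)          # correlation PCA
eigval, eigvec = np.linalg.eigh(R)
order = np.argsort(eigval)[::-1]          # components by variance explained
eigvec = eigvec[:, order]

k, selected = 3, []
for j in range(k):
    v = int(np.argmax(np.abs(eigvec[:, j])))
    if v not in selected:                 # avoid selecting a variable twice
        selected.append(v)
print("retained variables:", selected)    # a q << p subset to interpret
```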
25

Nunes, Madalena Baioa Paraíso. "Portfolio selection : a study using principal component analysis." Master's thesis, Instituto Superior de Economia e Gestão, 2017. http://hdl.handle.net/10400.5/14598.

Full text
Abstract:
Master's in Finance
In this thesis we apply principal component analysis to the Portuguese stock market using the constituents of the PSI-20 index from July 2008 to December 2016. The first seven principal components were retained, as we verified that these represented the major risk sources in this specific market. Seven principal portfolios were constructed and we compared them with other allocation strategies. The 1/N portfolio (with an equal investment in each of the 26 stocks), the PPEqual portfolio (with an equal investment in each of the 7 principal portfolios) and the MV portfolio (based on Markowitz's (1952) mean-variance strategy) were constructed. We concluded that these last two portfolios presented the best results in terms of return and risk, with PPEqual portfolio being more suitable for an investor with a greater degree of risk aversion and the MV portfolio more suitable for an investor willing to risk more in favour of higher returns. Regarding the level of risk, PPEqual is the portfolio with the best results and, so far, no other portfolio has presented similar values. Therefore, we found an equally-weighted portfolio among all the principal portfolios we built, which was the most risk efficient.
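The construction can be sketched in a few lines, with simulated returns standing in for the PSI-20 constituents: the eigenvectors of the return covariance matrix define the principal portfolios, and PPEqual invests equally across the retained ones. The numbers below are invented; only the mechanics follow the abstract.

```python
# Sketch of principal portfolios and the PPEqual / 1/N comparison
# (simulated returns; 26 assets and 7 retained components as in the abstract).
import numpy as np

rng = np.random.default_rng(3)
T, N, K = 500, 26, 7                     # days, assets, retained components
returns = rng.normal(0.0003, 0.01, size=(T, N))

cov = np.cov(returns, rowvar=False)
eigval, eigvec = np.linalg.eigh(cov)
idx = np.argsort(eigval)[::-1][:K]       # K largest-variance directions
P = eigvec[:, idx]                       # columns = principal portfolio weights

pp_returns = returns @ P                 # return series of each principal portfolio
ppequal = pp_returns.mean(axis=1)        # equal investment across the K portfolios
one_over_n = returns.mean(axis=1)        # 1/N benchmark over the raw assets

for name, r in [("PPEqual", ppequal), ("1/N", one_over_n)]:
    print(f"{name}: mean {r.mean():.5f}, sd {r.std():.5f}")
```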
26

Wong, Kevin Kin Foon. "An efficient sampler for decomposable covariance selection models /." View Abstract or Full-Text, 2002. http://library.ust.hk/cgi/db/thesis.pl?ISMT%202002%20WONG.

Full text
Abstract:
Thesis (M. Phil.)--Hong Kong University of Science and Technology, 2002.
Includes bibliographical references (leaves 35-36). Also available in electronic version. Access restricted to campus users.
27

Hu, Qing. "Predictor Selection in Linear Regression: L1 regularization of a subset of parameters and Comparison of L1 regularization and stepwise selection." Link to electronic thesis, 2007. http://www.wpi.edu/Pubs/ETD/Available/etd-051107-154052/.

Full text
28

Corral, Gavin Richard. "Investigating Selection Criteria of Constrained Cluster Analysis: Applications in Forestry." Thesis, Virginia Tech, 2014. http://hdl.handle.net/10919/78176.

Full text
Abstract:
Forest measurements are inherently spatial. Soil productivity varies spatially at fine scales, and tree growth responds through changes in growth-age trajectories. Measuring spatial variability is a prerequisite to more effective analysis and statistical testing. In this study, current techniques of partial redundancy analysis and constrained cluster analysis are used to explore how spatial variables determine structure in a managed, regularly spaced plantation. We test for spatial relationships in the data and then explore how those spatial relationships are manifested in spatially recognizable structures. The objectives of this research are to measure, test, and map spatial variability in simulated forest plots. Partial redundancy analysis was found to be a good method for detecting the presence or absence of spatial relationships (~95% accuracy). We found that the Calinski-Harabasz method consistently performed better at detecting the correct number of clusters when compared to several other methods. While there is still more work to be done, we believe that constrained cluster analysis has promising applications in forestry and that the Calinski-Harabasz criterion will be most useful.
Master of Science
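The selection criterion the study favours is straightforward to demonstrate with a generic sketch: fit clusterings over a range of k and keep the k that maximises the Calinski-Harabasz index. Plain k-means on toy data stands in for the spatially constrained clustering used in the thesis.

```python
# Generic sketch of choosing the number of clusters with the
# Calinski-Harabasz criterion (unconstrained k-means on toy data).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score

rng = np.random.default_rng(4)
centers = np.array([[0, 0], [5, 5], [0, 6]])
X = np.vstack([c + rng.normal(size=(60, 2)) for c in centers])  # 3 true groups

scores = {}
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = calinski_harabasz_score(X, labels)

print({k: round(v, 1) for k, v in scores.items()})
print("selected k:", max(scores, key=scores.get))   # should recover k = 3
```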
29

Lo, Siu-ming, and 盧小皿. "Factor analysis for ranking data." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1998. http://hub.hku.hk/bib/B30162464.

Full text
30

Lo, Siu-ming. "Factor analysis for ranking data /." Hong Kong : University of Hong Kong, 1998. http://sunzi.lib.hku.hk/hkuto/record.jsp?B20792967.

Full text
31

Chootrakool, Hathaikan. "Meta-analysis and sensitivity analysis for selection bias in multi-arm trials." Thesis, University of Newcastle Upon Tyne, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.556140.

Full text
Abstract:
Meta-analysis of multi-arm trials has been used increasingly in recent years; its aims are to combine evidence from all possible similar studies and draw inferences about the effectiveness of multiple compared treatments. Antiplatelet therapy is a pharmacologic therapy which aims to inhibit platelet activation and aggregation in the setting of arterial thrombosis. Throughout the thesis we use binary data from antiplatelet therapy to apply the models and sensitivity analysis. The normal approximation model using the empirical logistic transform has been employed to compare different treatments in multi-arm trials, allowing studies of both direct and indirect comparisons. The issue of direct-indirect comparison is studied in detail, borrowing strength from the indirect comparisons and making inferences about appropriately chosen parameters. Additionally, a hierarchical structure of the model addresses the problem of heterogeneity among different studies. However, the model requires a large sample size in each individual study. When the sample size is small, an exact logistic regression model is introduced. Both unconditional and conditional maximum likelihood approaches are performed to make inferences for the logistic regression model. We use Gaussian-Hermite quadrature to approximate the integral involved in the likelihood functions. Both approaches are examined on different cases in the simulation study. Studies with statistically significant results (positive results) are potentially more likely to be submitted, or selected more rapidly, than studies with non-significant results (negative results). This leads to false-positive results or an incorrect, usually over-optimistic, conclusion, a problem known as selection bias in meta-analysis. A funnel plot is a graphical tool which is used to detect selection bias in this research. We apply the idea of a sensitivity analysis by fitting a selection model to the available data of a meta-analysis, allowing different amounts of selection bias in the model, and investigate how sensitive the parameter of main interest is compared to the estimates of the standard model. We also examine the sensitivity analysis in a simulation study.
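The empirical logistic transform underlying the normal-approximation model is standard and can be stated compactly. The sketch below, with invented counts, shows the transform and its approximate variance for a single two-arm comparison; the thesis embeds these quantities in a hierarchical multi-arm model.

```python
# Empirical logistic transform for binary trial data (generic sketch).
import math

def empirical_logit(r, n):
    """Empirical log-odds for r events in n subjects, with the usual +0.5
    correction for zero cells, and its approximate variance."""
    z = math.log((r + 0.5) / (n - r + 0.5))
    v = 1.0 / (r + 0.5) + 1.0 / (n - r + 0.5)
    return z, v

zt, vt = empirical_logit(12, 100)        # treatment arm (invented counts)
zc, vc = empirical_logit(20, 100)        # control arm
log_or = zt - zc                         # estimated log odds ratio
se = math.sqrt(vt + vc)
print(f"log OR = {log_or:.3f}, 95% CI ({log_or - 1.96 * se:.3f}, {log_or + 1.96 * se:.3f})")
```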
32

Bunea, Florentina. "A model selection approach to partially linear regression /." Thesis, Connect to this title online; UW restricted, 2000. http://hdl.handle.net/1773/8971.

Full text
33

Loscalzo, Steven. "Group based techniques for stable feature selection." Diss., Online access via UMI, 2009.

Find full text
34

Kastner, Thomas M. "A sequential multinomial selection procedure with elimination." Diss., Georgia Institute of Technology, 1997. http://hdl.handle.net/1853/27977.

Full text
35

Wen, Songqiao. "Dimension reduction and variable selection in regression." HKBU Institutional Repository, 2008. http://repository.hkbu.edu.hk/etd_ra/914.

Full text
36

Tezcaner, Diclehan. "Multi-objective Route Selection." Master's thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/2/12610767/index.pdf.

Full text
Abstract:
In this thesis, we address the route selection problem for Unmanned Air Vehicles (UAV) under multiple objectives. We consider a general case for this problem where the UAV has to visit several targets and return to the base. For this case, there are multiple combinatorial problems to be considered. First, the paths to be followed between any pairs of targets should be determined. This part can be considered as a multi-objective shortest path problem. Additionally, we need to determine the order of the targets to be visited. This, in turn, is a multi-objective traveling salesperson problem. The overall problem is a combination of these two combinatorial problems. The route selection for UAVs has been studied by several researchers, mainly in the military context. They considered a linear combination of the two objectives – minimizing distance traveled and minimizing radar detection threat – and proposed heuristics for the minimization of the composite single-objective problem. We treat these two objectives separately. We develop an evolutionary algorithm to determine the efficient tours. We also consider an exact interactive approach to identify the best paths and tours of a decision maker. We tested the two solution approaches on both small-sized and large-sized problem instances.
37

Hu, Jiaqun. "An Empirical Comparison of Different Approaches in Portfolio Selection." Thesis, Uppsala universitet, Analys och tillämpad matematik, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-174962.

Full text
38

King, John Douglas. "Deep Web Collection Selection." Queensland University of Technology, 2004. http://eprints.qut.edu.au/15992/.

Full text
Abstract:
The deep web contains a massive number of collections that are mostly invisible to search engines. These collections often contain high-quality, structured information that cannot be crawled using traditional methods. An important problem is selecting which of these collections to search. Automatic collection selection methods try to solve this problem by suggesting the best subset of deep web collections to search based on a query. A few methods for deep web collection selection have been proposed, such as the Collection Retrieval Inference Network system and the Glossary of Servers Server system. The drawback of these methods is that they require communication between the search broker and the collections, and need metadata about each collection. This thesis compares three different sampling methods that do not require communication with the broker or metadata about each collection. It also adapts some traditional information retrieval techniques to this area. In addition, the thesis tests these techniques using the INEX collection, comprising 18 collections in total (including 12,232 XML documents) and 36 queries. The experiments show that the performance of the sample-based techniques is satisfactory on average.
39

Delaere, Ian. "The chemistry of Vicia sativa L. selection." Title page, contents and abstract only, 1996. http://web4.library.adelaide.edu.au/theses/09PH/09phd332.pdf.

Full text
Abstract:
Bibliography: leaves 151-166. This thesis describes the development of two novel and complementary analytical approaches for assaying cyanoalanine non-protein amino acids. These assays are used to determine the distribution of these compounds both within and between plants and to identify accessions of common vetch in germplasm collections which contain low levels of the cyanoalanine non-protein amino acids. These analytical tools are used to correlate toxicity observed in animal feeding experiments with cyanoalanine content. This thesis also covers the first report of the use of diffuse reflectance with dispersive infrared spectrometry for the "in situ" quantification of specific organic components from plant tissue, as well as the first use of micellar electrokinetic chromatography for the quantitative analysis of 9-fluorenylmethyl chloroformate (FMOC) derivatised and non-derivatised components of extracts from plant material.
40

Pesudovs, Konrad, J. M. Burr, Clare Harley, and David B. Elliott. "The development, assessment, and selection of questionnaires." Lippincott Williams & Wilkins for the American Academy of Optometry, 2007. http://hdl.handle.net/10454/4513.

Full text
Abstract:
Patient-reported outcome measurement has become accepted as an important component of comprehensive outcomes research. Researchers wishing to use a patient-reported measure must either develop their own questionnaire (called an instrument in the research literature) or choose from the myriad of instruments previously reported. This article summarizes how previously developed instruments are best assessed using a systematic process and we propose a system of quality assessment so that clinicians and researchers can determine whether there exists an appropriately developed and validated instrument that matches their particular needs. These quality assessment criteria may also be useful to guide new instrument development and refinement. We welcome debate over the appropriateness of these criteria as this will lead to the evolution of better quality assessment criteria and in turn better assessment of patient-reported outcomes.
41

Rafiee, Mohammad Mohsen. "Model Selection and Uniqueness Analysis for Reservoir History Matching." Doctoral thesis, Technische Universitaet Bergakademie Freiberg Universitaetsbibliothek "Georgius Agricola", 2011. http://nbn-resolving.de/urn:nbn:de:bsz:105-qucosa-65509.

Full text
Abstract:
“History matching” (model calibration, parameter identification) is an established method for determining representative reservoir properties such as permeability, porosity, relative permeability and fault transmissibility from a measured production history; however, the uniqueness of the selected model is always a challenge in successful history matching. Up to now, the uniqueness of history matching results in practice could be assessed only through individual technical experience and/or by repeating history matching with different reservoir models (different sets of parameters as the starting guess). The present study uses the stochastic theory of Kullback & Leibler (K-L) and its further development by Akaike (AIC) for the first time in reservoir engineering to address the uniqueness problem. In addition, based on the AIC principle and the principle of parsimony, a penalty term for the objective function has been empirically formulated from geoscientific and technical considerations. Finally, a new formulation (Penalized Objective Function, POF) has been developed for model selection in reservoir history matching and has been tested successfully on a North German gas field.
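The Akaike principle behind the proposed penalty can be shown with a generic sketch. The POF itself is specific to the thesis; this only illustrates AIC's trade-off between goodness of fit and parameter count, with invented candidate history-match models.

```python
# Generic AIC model-selection sketch (least-squares form): more parameters
# improve the fit (lower RSS) but are penalised, favouring parsimony.
import math

def aic(n, rss, k):
    """AIC for a Gaussian least-squares model: n data points, residual sum
    of squares rss, k fitted parameters."""
    return n * math.log(rss / n) + 2 * k

n = 50                                   # production-history points (invented)
candidates = [("coarse, 3 params", 4.10, 3),
              ("medium, 6 params", 2.70, 6),
              ("fine, 12 params", 2.55, 12)]
scored = [(name, aic(n, rss, k)) for name, rss, k in candidates]
for name, a in scored:
    print(f"{name:<18} AIC = {a:.1f}")
print("selected:", min(scored, key=lambda t: t[1])[0])
```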
42

Jonen, Benjamin Philipp. "An Empirical Analysis of Paper Selection by Digital Printers." Thesis, Georgia Institute of Technology, 2007. http://hdl.handle.net/1853/16180.

Full text
Abstract:
The Printing Industry is undergoing a 'Digital Revolution'. The importance of digital printing has been increasing substantially over the last decade. How has this development affected the paper selection of printing firms? Only paper suppliers who successfully anticipate the changing needs of the printing firms will be able to benefit from the industry trend. This paper employs a probability model to analyze a survey data set of 103 digital printing firms in the USA and Canada. The research idea is to link the firm's paper selection with the firm's characteristics in order to gain insights into the printing firm's paper purchase behavior and the overall industry structure. The first part of this work investigates the importance of certain paper aspects, such as price, runnability and print quality. Strikingly, a company's involvement in digital printing, measured by the percentage of digital printers among the total number of printers in the firm, is a central determinant of the importance of all paper aspects analyzed. This finding underscores the tremendous importance of the printing firms' transition to digital printing for the Paper Industry. Paper runnability is found to become more important the faster the firm grows, which can be explained by the fact that more successful firms incur higher opportunity costs from downtime. Another key finding is that the importance of paper price is lower for firms who collaborate with their customers on the paper selection and are able to pass on increases in the paper price. The second part involves a more direct assessment of paper selection. Here, the firm's characteristics are utilized to explain the choice of coated versus uncoated paper for the printing job. The analysis shows that firms involved in sophisticated print services, such as Digital Asset Management or Variable Data Printing, are more likely to use the higher quality coated paper. Further, it is found that the usage of coated paper increases with catalog printing, whereas it decreases with book and manual printing.
43

Yu, Chi Wai. "Median loss analysis and its application to model selection." Thesis, University of British Columbia, 2009. http://hdl.handle.net/2429/7110.

Full text
Abstract:
In this thesis, we propose a median-loss-based procedure for inference. The optimal estimators under this criterion often have desirable properties. For instance, they have good resistance to outliers and are resistant to the specific loss used to form them. In the Bayesian framework, we establish the asymptotics of median-loss-based Bayes estimators. It turns out that the median-based Bayes estimator has a root-n rate of convergence and is asymptotically normal. We also give a simple way to compute this Bayesian estimator. In regression problems, we compare the median-based Bayes estimator with two other estimators. One is the frequentist version of our median-loss-minimizing estimator, which is exactly the least median of squares (LMS) estimator; the other is the two-sided least trimmed squares (LTS) estimator. This comparison is natural because the LMS estimator is median-based but only has cube-root-n convergence, while the two-sided LTS is not median-based but has root-n convergence. We show that our median-based Bayes estimator is a good tradeoff between the LMS and two-sided LTS estimators. For model selection problems, we propose a median analog of the usual cross validation procedure. In the context of linear models, we present simulation results comparing the performance of cross validation (CV) and median cross validation (MCV). Our results show that when the error terms come from a heavy-tailed distribution, or from the normal distribution with small values of the unknown parameters, MCV works better than CV does in terms of the probability of choosing the true model. By contrast, when the error terms come from the normal distribution and the values of the unknown parameters are large, CV outperforms MCV.
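A minimal sketch of the contrast between CV and MCV in the linear-model setting the abstract describes: both score a candidate model from the same leave-one-out prediction errors, but CV averages them while MCV takes their median, so a few outlying observations cannot dominate the comparison. The data, models, and helper function are illustrative assumptions only.

```python
import numpy as np

def loo_squared_errors(X, y):
    """Leave-one-out squared prediction errors for ordinary least squares."""
    errs = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        beta, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
        errs.append((y[i] - X[i] @ beta) ** 2)
    return np.array(errs)

rng = np.random.default_rng(1)
n = 50
x = rng.normal(size=n)
y = 2 * x + rng.standard_t(df=1, size=n)        # heavy-tailed errors
X_true = np.column_stack([np.ones(n), x])        # true linear model
X_over = np.column_stack([np.ones(n), x, x**2])  # over-specified model
for name, X in [("true", X_true), ("over", X_over)]:
    e = loo_squared_errors(X, y)
    print(name, "CV:", e.mean(), "MCV:", np.median(e))
```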
APA, Harvard, Vancouver, ISO, and other styles
44

Querel, Richard Robert, and University of Lethbridge Faculty of Arts and Science. "IRMA calibrations and data analysis for telescope site selection." Thesis, Lethbridge, Alta. : University of Lethbridge, Faculty of Arts and Science, 2007. http://hdl.handle.net/10133/675.

Full text
Abstract:
Our group has developed a 20 μm passive atmospheric water vapour monitor. The Infrared Radiometer for Millimetre Astronomy (IRMA) has been commissioned and deployed for site testing for the Thirty Meter Telescope (TMT) and the Giant Magellan Telescope (GMT). Measuring precipitable water vapour (PWV) requires both a sophisticated atmospheric model (BTRAM) and an instrument (IRMA). Atmospheric models depend on atmospheric profiles. Most profiles are generic in nature, representing only a latitude in some cases. Site-specific atmospheric profiles are required to accurately simulate the atmosphere above any location on Earth. These profiles can be created from publicly available archives of radiosonde data, that offer nearly global coverage. Having created a site-specific profile and model, it is necessary to determine the PWV sensitivity to the input parameter uncertainties used in the model. The instrument must also be properly calibrated. In this thesis, I describe the radiometric calibration of the IRMA instrument, and the creation and analysis of site-specific atmospheric models for use with the IRMA instrument in its capacity as an atmospheric water vapour monitor for site testing.
xii, 135 leaves : ill. ; 28 cm.
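The sketch below illustrates the kind of one-at-a-time sensitivity analysis the abstract alludes to for the model's input parameter uncertainties, with a toy surrogate standing in for the BTRAM radiative-transfer model; the function form and parameter names are hypothetical.

```python
import numpy as np

def pwv_model(params):
    """Toy surrogate for an atmospheric model such as BTRAM: maps
    profile parameters to a retrieved precipitable water vapour value.
    The real model is far more involved; this stands in for illustration."""
    t_surface, scale_height, humidity = params
    return humidity * scale_height * np.exp(-(t_surface - 273.0) / 50.0)

def sensitivities(f, params, rel_step=0.01):
    """One-at-a-time finite-difference sensitivity of f to each input."""
    base = f(params)
    sens = []
    for i, p in enumerate(params):
        perturbed = list(params)
        perturbed[i] = p * (1 + rel_step)
        sens.append((f(perturbed) - base) / (p * rel_step))
    return base, sens

# Hypothetical site profile: surface temperature (K), water-vapour scale
# height (km), relative humidity fraction.
base_pwv, sens = sensitivities(pwv_model, [280.0, 2.0, 0.4])
print(base_pwv, sens)  # shows which profile uncertainty dominates the PWV error
```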
APA, Harvard, Vancouver, ISO, and other styles
45

François, Damien. "High-dimensional data analysis : optimal metrics and feature selection." Université catholique de Louvain, 2007. http://edoc.bib.ucl.ac.be:81/ETD-db/collection/available/BelnUcetd-01152007-162739/.

Full text
Abstract:
High-dimensional data are everywhere: texts, sounds, spectra, images, etc. are described by thousands of attributes. However, many of the data analysis tools at our disposal (coming from statistics, artificial intelligence, etc.) were designed for low-dimensional data, and many of the explicit or implicit assumptions made while developing the classical data analysis tools do not transfer to high-dimensional data. For instance, many tools rely on the Euclidean distance to compare data elements. But the Euclidean distance concentrates in high-dimensional spaces: all distances between data elements seem identical. The Euclidean distance is furthermore incapable of distinguishing important attributes from irrelevant ones. This thesis therefore focuses on the choice of a relevant distance function to compare high-dimensional data and on the selection of the relevant attributes. In Part One of the thesis, the phenomenon of the concentration of distances is considered, and its consequences for data analysis tools are studied. It is shown that for nearest-neighbour search, the Euclidean distance and the Gaussian kernel, both heavily used, may not be appropriate; it is thus proposed to use fractional metrics and generalised Gaussian kernels. Part Two of the thesis focuses on the problem of feature selection in the case of a large number of initial features. Two methods are proposed to (1) reduce the computational burden of the feature selection process and (2) cope with the instability induced by the high correlation between features that often appears in high-dimensional data. Most of the concepts studied and presented in this thesis are illustrated on chemometric data, and more particularly on spectral data, with the objective of inferring a physical or chemical property of a material by analysing the spectrum of the light it reflects.
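A small experiment in the spirit of Part One: as dimensionality grows, the relative contrast of Euclidean distances collapses, while a fractional metric (Minkowski exponent below 1) retains more contrast between near and far neighbours. The uniform data and sample sizes are assumptions for illustration, not taken from the thesis.

```python
import numpy as np

def relative_contrast(X, p):
    """(max - min) / min of Minkowski-p distances from one query point to
    the rest; values near zero signal concentration of the distances."""
    q, rest = X[0], X[1:]
    d = np.sum(np.abs(rest - q) ** p, axis=1) ** (1.0 / p)
    return (d.max() - d.min()) / d.min()

rng = np.random.default_rng(0)
for dim in (2, 10, 100, 1000):
    X = rng.uniform(size=(500, dim))
    # Euclidean (p=2) contrast shrinks with dimension; the fractional
    # metric (p=0.5) keeps neighbours more distinguishable.
    print(dim, relative_contrast(X, 2.0), relative_contrast(X, 0.5))
```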
APA, Harvard, Vancouver, ISO, and other styles
46

Shen, Lin. "GIS-based Multi-criteria Analysis for Aquaculture Site Selection." Thesis, University of Gävle, Department of Industrial Development, IT and Land Management, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:hig:diva-7532.

Full text
Abstract:

The pearl oyster Pinctada martensii (also known as Pinctada fucata) is used to produce the South China Sea pearl, and its production plays a key role in the economic and social welfare of the coastal areas. To guarantee both abundant and sustainable pearl oyster production, identifying suitable areas for aquaculture is an important consideration in any aquaculture activity. GIS analysis has rarely been used for site selection in the Chinese fishery industry; this study therefore supports the local government in searching for suitable sites from a GIS perspective. The study was conducted to find the optimal sites for suspended culture of the pearl oyster Pinctada martensii using GIS-based multi-criteria analysis. The original idea came from the research of Radiarta and his colleagues in 2008 in Japan. Most of the parameters in the GIS model were extracted from remote sensing data (Moderate Resolution Imaging Spectroradiometer and Landsat 7). Eleven thematic layers were arranged into three sub-models, namely a biophysical model, a social-economic model and a constraint model. The biophysical model includes sea surface temperature, chlorophyll-α concentration, suspended sediment concentration and bathymetry. The criteria in the social-economic model are distance to cities and towns and distance to piers. The constraint model was used to exclude from the research area the places where the natural conditions cannot support the development of pearl oyster aquaculture; it contains river mouths, tourism areas, harbors, salt fields / shrimp ponds, and non-related water areas. Finally, these GIS sub-models were combined by weighted linear combination evaluation to identify the optimal sites for pearl oyster Pinctada martensii culture. In the final result, suitability levels ranged from 1 (least suitable) to 8 (most suitable), and about 2.4% of the total potential area fell into the higher levels (levels 6 and 7). These areas were considered the most suitable places for pearl oyster Pinctada martensii culture in the coastal waters of Yingpan.
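A minimal sketch of weighted linear combination over rasterised suitability layers with a constraint mask, as described above; the grid, layer scores, and weights are invented for illustration and do not reproduce the study's values.

```python
import numpy as np

# Hypothetical grid of suitability scores, each layer already rescaled to
# the study's 1-8 suitability scale (values and weights are illustrative).
shape = (200, 300)
rng = np.random.default_rng(0)
sst  = rng.integers(1, 9, shape)   # sea surface temperature score
chl  = rng.integers(1, 9, shape)   # chlorophyll-a concentration score
ssc  = rng.integers(1, 9, shape)   # suspended sediment concentration score
pier = rng.integers(1, 9, shape)   # social-economic score (distance to piers)

# Constraint mask: True = usable water, False = excluded areas such as
# river mouths, tourism areas, harbors, salt fields / shrimp ponds.
constraint = rng.random(shape) > 0.3

weights = {"sst": 0.35, "chl": 0.30, "ssc": 0.20, "pier": 0.15}
suitability = (weights["sst"] * sst + weights["chl"] * chl +
               weights["ssc"] * ssc + weights["pier"] * pier)
suitability = np.where(constraint, suitability, 0)   # apply constraint model

top = suitability >= 6                               # levels 6 and above
print("share of most suitable area:", top.mean())
```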

APA, Harvard, Vancouver, ISO, and other styles
47

Beatty, Rodger James. "Unison Canadian choral compositions, selection and analysis for schools." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape7/PQDD_0007/NQ41058.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Barbera, Maria Antonia Petes Thomas D. "Selection and analysis of mitotic crossovers in Saccharomyces cerevisiae." Chapel Hill, N.C. : University of North Carolina at Chapel Hill, 2007. http://dc.lib.unc.edu/u?/etd,959.

Full text
Abstract:
Thesis (Ph. D.)--University of North Carolina at Chapel Hill, 2007.
Title from electronic title page (viewed Dec. 18, 2007). "... in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Curriculum in Genetics and Molecular Biology." Discipline: Genetics and Molecular Biology; Department/School: Medicine.
APA, Harvard, Vancouver, ISO, and other styles
49

Schmitz, David. "Automated Service-Oriented Impact Analysis and Recovery Alternative Selection." Diss., freely available, 2008. http://edoc.ub.uni-muenchen.de/8999/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Bhatt, Samir. "Statistical analysis of natural selection in RNA virus populations." Thesis, University of Oxford, 2010. http://ora.ox.ac.uk/objects/uuid:64341c38-f09e-48ed-84e8-7ab9f171a753.

Full text
Abstract:
A key goal of modern evolutionary biology is the identification of genes or genome regions that have been targeted by natural selection. Methods for detecting natural selection utilise the information sampled in contemporary gene sequences and test for deviation from the null hypothesis of neutrality. One such method is the McDonald-Kreitman test (MK test), which detects the molecular 'footprint' left by natural selection by considering the frequency of observed mutations within the sampled population. In this thesis I investigate the applicability of the MK test to viral populations and develop several new methods based on the original MK test. In chapter 2, I use a combination of simulation and methodological improvements to show that the MK test can have low error when applied to the analysis of RNA virus populations. Then, in chapter 3, I develop an extension of the MK test for estimating rates of adaptive fixation for all genes of the human influenza A virus subtypes H1N1 and H3N2. My results are consistent with previous studies on selection in influenza virus populations, and provide a new perspective on the evolutionary dynamics of human influenza virus. In chapter 4 I develop a formal statistical framework, based on the MK test, for calculating the number of non-neutral sites at any frequency range in the site frequency spectrum. In this framework, I introduce a new method for reconstructing the site frequency spectrum that incorporates sampling error and allows for the inclusion of prior knowledge. Using this new framework I show that the majority of nucleotide sites in hepatitis C virus sequences sampled during chronic infection represent deleterious mutations. Finally, in chapter 5 I use the generalised framework introduced in chapter 4 to develop a statistic for evaluating the deleterious mutation load of a population. I apply this test to sequences representing 96 RNA virus genes and show that my approach has power comparable to equivalent phylogenetic methods. In this thesis I have developed computationally efficient methods for the analysis of genetic data from virus populations. It is my hope that these methods will become useful given the explosion in sequence data that has accompanied recent improvements in sequencing technology.
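For readers unfamiliar with the MK test itself, the sketch below computes the classical 2x2 version with Fisher's exact test, together with the neutrality index and the derived proportion of adaptive substitutions. The counts are invented, and the thesis's extensions (adaptive-rate estimation, site-frequency-spectrum reconstruction) are not reproduced here.

```python
from scipy.stats import fisher_exact

# Counts of nonsynonymous (N) and synonymous (S) changes, split into fixed
# differences between lineages and polymorphisms within the sample.
# Values are illustrative only.
Dn, Ds = 40, 20   # fixed nonsynonymous / synonymous differences
Pn, Ps = 15, 30   # nonsynonymous / synonymous polymorphisms

# Under neutrality Dn/Ds should equal Pn/Ps; a 2x2 test detects deviation.
odds_ratio, p_value = fisher_exact([[Dn, Ds], [Pn, Ps]])
neutrality_index = (Pn / Ps) / (Dn / Ds)   # NI < 1 suggests adaptive fixation
alpha = 1 - neutrality_index               # proportion of adaptive substitutions
print(p_value, neutrality_index, alpha)
```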
APA, Harvard, Vancouver, ISO, and other styles