Dissertations / Theses: 'Partial least squares analysis'

1

Moller, Jurgen Johann. "The implementation of noise addition partial least squares." Thesis, Stellenbosch : University of Stellenbosch, 2009. http://hdl.handle.net/10019.1/3362.

Abstract:

Thesis (MComm (Statistics and Actuarial Science))--University of Stellenbosch, 2009.
When determining the chemical composition of a specimen, traditional laboratory techniques are often both expensive and time consuming. It is therefore preferable to employ more cost effective spectroscopic techniques such as near infrared (NIR). Traditionally, the calibration problem has been solved by means of multiple linear regression to specify the model between X and Y. Traditional regression techniques, however, quickly fail when using spectroscopic data, as the number of wavelengths can easily be several hundred, often exceeding the number of chemical samples. This scenario, together with the high level of collinearity between wavelengths, will necessarily lead to singularity problems when calculating the regression coefficients. Ways of dealing with the collinearity problem include principal component regression (PCR), ridge regression (RR) and partial least squares (PLS) regression. Both PCR and RR require a significant amount of computation when the number of variables is large. PLS overcomes the collinearity problem in a similar way to PCR, by modelling both the chemical and spectral data as functions of common latent variables. The quality of the employed reference method greatly impacts the coefficients of the regression model and, therefore, the quality of its predictions. With both X and Y subject to random error, the quality of the predictions of Y will be reduced with an increase in the level of noise. Previously conducted research focussed mainly on the effects of noise in X. This paper focuses on a method proposed by Dardenne and Fernández Pierna, called Noise Addition Partial Least Squares (NAPLS), that attempts to deal with the problem of poor reference values. Some aspects of the theory behind PCR, PLS and model selection are discussed. This is then followed by a discussion of the NAPLS algorithm. Both PLS and NAPLS are implemented on various datasets that arise in practice, in order to determine cases where NAPLS will be beneficial over conventional PLS.
For each dataset, specific attention is given to the analysis of outliers, influential values and the linearity between X and Y, using graphical techniques. Lastly, the performance of the NAPLS algorithm is evaluated for various
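
The abstract above describes PLS as modelling the spectral and chemical data through common latent variables when wavelengths outnumber samples. A minimal sketch of single-response PLS (PLS1) with a NIPALS-style loop may make this concrete. It is not the thesis's implementation and does not cover NAPLS; the function name pls1_fit and its arguments are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit single-response PLS regression (PLS1) with a NIPALS-style loop.

    Hypothetical helper for illustration only.
    Returns (beta, intercept) so that predictions are X @ beta + intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean       # centred working copies, deflated below

    weights, loadings, coefs = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                     # direction of maximal covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                  # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                        # latent-variable scores
        tt = t @ t
        p = Xk.T @ t / tt                 # X loadings
        c = (yk @ t) / tt                 # regression of y on the scores
        Xk = Xk - np.outer(t, p)          # deflate X
        yk = yk - c * t                   # deflate y
        weights.append(w)
        loadings.append(p)
        coefs.append(c)

    W = np.column_stack(weights)
    P = np.column_stack(loadings)
    q = np.array(coefs)
    beta = W @ np.linalg.solve(P.T @ W, q)  # coefficients in the original X space
    intercept = y_mean - x_mean @ beta
    return beta, intercept
```

Because each component is a single score vector, the fit stays well defined even when X has far more columns than rows, which is the situation in which ordinary least squares becomes singular.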

2

Krämer, Nicole. "Analysis of high dimensional data with partial least squares and boosting." [S.l.] : [s.n.], 2006. http://opus.kobv.de/tuberlin/volltexte/2007/1484.

3

Li, Siqing. "Kernel-based least-squares approximations: theories and applications." HKBU Institutional Repository, 2018. https://repository.hkbu.edu.hk/etd_oa/539.

Abstract:

Kernel-based meshless methods for approximating functions and solutions of partial differential equations have many applications in engineering fields. As only scattered data are used, meshless methods using radial basis functions can be extended to complicated geometry and high-dimensional problems. In this thesis, kernel-based least-squares methods will be used to solve several direct and inverse problems. In chapter 2, we consider discrete least-squares methods using radial basis functions. A general l^2-Tikhonov regularization with W_2^m-penalty is considered. We provide error estimates that are comparable to kernel-based interpolation in cases in which the function being approximated is within and is outside of the native space of the kernel. These results are extended to the case of noisy data. Numerical demonstrations are provided to verify the theoretical results. In chapter 3, we apply kernel-based collocation methods to elliptic problems with mixed boundary conditions. We propose some weighted least-squares formulations with different weights for the Dirichlet and Neumann boundary collocation terms. Besides fill distance of discrete sets, our weights also depend on three other factors: proportion of the measures of the Dirichlet and Neumann boundaries, dimensionless volume ratios of the boundary and domain, and kernel smoothness. We determine the dependencies of these terms in weights by different numerical tests. Our least-squares formulations can be proved to be convergent at the H^2 (Ω) norm. Numerical experiments in two and three dimensions show that we can obtain desired convergent results under different boundary conditions and different domain shapes. In chapter 4, we use a kernel-based least-squares method to solve ill-posed Cauchy problems for elliptic partial differential equations. We construct stable methods for these inverse problems. 
Numerical approximations to solutions of elliptic Cauchy problems are formulated as solutions of nonlinear least-squares problems with quadratic inequality constraints. A convergence analysis with respect to noise levels and fill distances of data points is provided, from which a Tikhonov regularization strategy is obtained. A nonlinear algorithm is proposed to obtain stable solutions of the resulting nonlinear problems. Numerical experiments are provided to verify our convergence results. In the final chapter, we apply meshless methods to the Gierer-Meinhardt activator-inhibitor model. Pattern transitions in irregular domains of the Gierer-Meinhardt model are shown. We propose various parameter settings for different patterns appearing in nature and test these settings on some irregular domains. To further simulate patterns in reality, we construct different kinds of domains and apply proposed parameter settings on different patches of domains found in nature.
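
The discrete least-squares approach with Tikhonov regularization mentioned for chapter 2 can be sketched in one dimension as follows. This is an illustrative simplification, not the thesis's method: it assumes a Gaussian kernel and penalises the native-space norm of the kernel expansion instead of the W_2^m penalty named in the abstract. All function names and parameters (gaussian_kernel, regularized_ls_fit, evaluate, shape, lam) are hypothetical.

```python
import numpy as np


def gaussian_kernel(x, z, shape=8.0):
    """Gaussian RBF matrix with entries exp(-(shape * |x_i - z_j|)^2) for 1D points."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    r = np.abs(x[:, None] - z[None, :])
    return np.exp(-(shape * r) ** 2)


def regularized_ls_fit(x, f, centers, lam=1e-8, shape=8.0):
    """Minimise ||A c - f||^2 + lam * c^T K c over the expansion coefficients c.

    A holds kernel values between data sites and centers; K is the kernel
    matrix of the centers (assumed penalty: native-space norm, see lead-in).
    """
    A = gaussian_kernel(x, centers, shape)
    K = gaussian_kernel(centers, centers, shape)
    rhs = A.T @ np.asarray(f, dtype=float)
    return np.linalg.solve(A.T @ A + lam * K, rhs)  # regularised normal equations


def evaluate(x_new, centers, coef, shape=8.0):
    """Evaluate the kernel expansion sum_j coef[j] * k(x, centers[j])."""
    return gaussian_kernel(x_new, centers, shape) @ coef
```

Using more data sites than centers gives an overdetermined system, and the parameter lam trades data fidelity against the size of the expansion, which matters once the data are noisy.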

4

Zhou, Yue. "Analysis of Additive Risk Model with High Dimensional Covariates Using Partial Least Squares." Digital Archive @ GSU, 2006. http://digitalarchive.gsu.edu/math_theses/6.

Abstract:

In this thesis, we consider the problem of constructing an additive risk model based on right censored survival data to predict the survival times of cancer patients, especially when the dimension of the covariates is much larger than the sample size. For microarray gene expression data, the number of gene expression levels is far greater than the number of samples. Such “small n, large p” problems have led researchers in recent years to investigate the association between cancer patient survival times and gene expression profiles. We apply Partial Least Squares to reduce the dimension of the covariates and obtain the corresponding latent variables (components), and these components are used as new regressors to fit the extensional additive risk model. We also employ the time dependent AUC curve (area under the Receiver Operating Characteristic (ROC) curve) to assess how well the model predicts the survival time. Finally, this approach is illustrated by re-analysis of the well known AML data set and breast cancer data set. The results show that the model fits both data sets very well.

5

Skoglund, Ingegerd. "Algorithms for a Partially Regularized Least Squares Problem." Licentiate thesis, Linköping : Linköpings universitet, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-8784.

6

Yue, Weiping Biotechnology & Biomolecular Sciences Faculty of Science UNSW. "Predicting the citation impact of clinical neurology journals using structural equation modeling with partial least squares." Awarded by: University of New South Wales. School of Biotechnology and Biomolecular Sciences, 2004. http://handle.unsw.edu.au/1959.4/20821.

Abstract:

The ongoing debate on the evaluative role of citation analysis and the theory of citation recognizes that the citation process is complex and that citation counts are affected by certain extra-scientific or external factors. To date, little effort has been made to explore the effects of various external factors; this thesis addresses this lack. In the context of the various perspectives on citations and citation analysis, this study uses journals as the unit of analysis and investigates what, how, and to what extent extra-scientific factors influence the citation impact of journals. An integrated conceptual model of Journal Citation Impact that takes into account current theoretical positions and prior empirical research findings is developed. It addresses the interrelationships between Journal Citation Impact and a range of external factors (Journal Properties, Journal Visibility, Journal Accessibility, Journal Internationality, Journal Selectivity, Journal Promptness, Journal Editorial Prestige, and Perceived Journal Quality). The proposed conceptual model is novel in that it: (1) incorporates nearly all possible external factors that affect Journal Citation Impact; (2) addresses the complex interrelationships between a number of external factors and Journal Citation Impact in one model; (3) regards both Journal Citation Impact and its external factors as theoretical constructs; and (4) identifies the observed variables of the external factors and Journal Citation Impact. However, because of the difficulties in operationalizing all the theoretical constructs, this conceptual model is simplified to an operational model for empirical testing. The operational model includes the construct Journal Citation Impact and four of its external factors, Journal Properties, Journal Accessibility, Journal Internationality, and Perceived Journal Quality. 
Structural Equation Modeling (SEM) with Partial Least Squares (PLS) is used to test the operational model with empirical data from 41 research journals in clinical neurology. Data are collected from bibliographic database searching, web searching, printed journals, and from a web-based survey that was conducted to obtain information on perceptions of journal quality. Empirical results of the operational model show that Journal Accessibility, Journal Internationality, and Perceived Journal Quality have large, medium, and small effects respectively on Journal Citation Impact, thus indicating that certain extra-scientific factors can influence Journal Citation Impact significantly. The findings suggest that great care should be taken in interpreting and evaluating the results obtained from citation analysis. In terms of Journal Citation Impact, this research also suggests that various journal citation indicators should be used to reflect different aspects of citation impact. By exploring the phenomenological domain in the citing process, this exploratory study not only provides a better understanding of citation analysis, it also contributes to the development of the theory of citation. From the methodological perspective, introducing SEM with PLS to Informetrics and Scientometrics also contributes to the knowledge base of these fields. Pragmatically, the research findings will enhance the judgment of researchers and practitioners such as editors, publishers, librarians and other information specialists in assessing journal performance. Finally, the worldwide survey findings on peer assessment of journal outlets in clinical neurology will be useful for researchers, academics or clinicians in this field.

7

Patten, Kyle. "An analysis of the modeling used to determine customer satisfaction." Thesis, Kansas State University, 2014. http://hdl.handle.net/2097/35765.

Abstract:

Master of Agribusiness
Department of Agricultural Economics
Kevin Dhuyvetter
Many companies use surveys to establish customer satisfaction metrics. This OEM has been using surveys to analyze customer satisfaction with their products, services, and distribution channel for several decades. Satisfaction metrics are established for the brand, product, and channel partners. The product metric is derived from a question on the survey asking customers how satisfied they are with the product. There are subsequent questions thereafter inquiring about satisfaction with specific functional areas of the product. It is common practice to use Partial Least Squares (PLS) regression analysis to evaluate what impacts the functional area questions have on the overall satisfaction question. The model results are used to understand what areas of the machine should be focused on to improve customers’ experiences with the machine. These results are compared to other data sources such as warranty, field reports, customer focus groups, etc. The results from these models are sometimes questioned based on what common intuition would suggest. Typically the top three drivers to the product metric are understandable, but there are often one or two key areas that do not make logical sense. The objective of this thesis was to understand whether PLS modeling is appropriate given the nature of customer survey data. Models were estimated using existing survey data on a specific model in the tractor product line. PLS models assume data are linear with no bounds. This in itself likely makes this type of model inappropriate for analyzing customer survey data. Responses are bounded on an 11 point scale from 0-10, however, the PLS model being non-bounded assumes there can be a score under 0 or over 10. The model also assumes a linear slope that would indicate each covariate answer 0-10 has the same level of effect on the response variable. This research has found that each covariate answer is in fact non-linear. 
For example, a customer answering a 2 to quality of manufacturing workmanship has a different impact on the overall satisfaction score than a customer who answers 8. Finally, this research discovered that the PLS models produce negative coefficients of significant value that are not reported to the enterprise. Binary and ordered logistic (logit) models were estimated as an alternative to PLS. Logistic models are non-linear and are commonly used to evaluate bounded data. Response data were separated into two groups based on Net Promoter Score (NPS) Methodology (Reicheld 2006). Using the NPS methodology, 0-6 scores are considered detractors, 7-8 scores are considered passives, and 9-10 scores are considered promoters. The logistic models demonstrate that the top two drivers to customer satisfaction scores are still quality of manufacturing workmanship and reliability/operational availability (similar to results of the PLS model). The unresolved problems question on the survey was included in the models and demonstrated that the predicted probability of a customer being a promoter is much higher in both binary and ordered logit models if no unresolved problems exist. Finally, the model found engine oil consumption remained negative and is statistically significant suggesting that even with the alternative modeling approach there still may be data issues related to the survey. It is recommended that the OEM implement logistic modeling for analyzing customer survey data. It is also recommended that a new survey design be constructed to eliminate issues with correlated data that can lead to spurious and unexplainable results.
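
The Net Promoter Score grouping quoted in this abstract (0-6 detractors, 7-8 passives, 9-10 promoters) can be written down directly. The sketch below is illustrative only and is not taken from the thesis: the function names are hypothetical, and the score formula (percentage of promoters minus percentage of detractors) is the standard NPS definition, which the abstract itself does not spell out.

```python
def nps_group(score):
    """Map a 0-10 survey answer to its Net Promoter Score group."""
    if not 0 <= score <= 10:
        raise ValueError("score must lie on the 0-10 scale")
    if score <= 6:
        return "detractor"
    if score <= 8:
        return "passive"
    return "promoter"


def net_promoter_score(scores):
    """Percentage of promoters minus percentage of detractors (standard NPS)."""
    groups = [nps_group(s) for s in scores]
    promoters = groups.count("promoter")
    detractors = groups.count("detractor")
    return 100.0 * (promoters - detractors) / len(groups)
```

A binary logit of the kind the abstract mentions could then use an indicator such as nps_group(score) == "promoter" as its response, while an ordered logit would use all three groups.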

8

Nguyen, Nga. "Multivariate analysis and GIS in generating vulnerability map of acid sulfate soils." Thesis, KTH, Mark- och vattenteknik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-170472.

Abstract:

The study employed multivariate methods to generate vulnerability maps for acid sulfate (AS) soils in Norrbotten County, Sweden. In this study, the relationships between the reclassified datasets and each biogeochemical element were carefully evaluated with Kruskal-Wallis one-way ANOVA and PLS analysis. The statistical results of the Kruskal-Wallis ANOVA provided useful knowledge of the relationships between the preliminary vulnerability ranks in the classified datasets and the amount of each biogeochemical element. Then, the statistical knowledge and expert knowledge were used to generate the final vulnerability ranks of AS soils in the classified datasets, which were the input independent variables in the PLS analyses. The results of the Kruskal-Wallis one-way ANOVA and PLS analyses showed a strong correlation between higher levels of total Cu2+, Ni2+ and S and higher vulnerability ranks in the classified datasets. Hence, total Cu2+, Ni2+ and S were chosen as the dependent variables for further PLS analyses. In particular, the Variable Importance in the Projection (VIP) value of each classified dataset was standardized to generate its weight. The vulnerability map of AS soil was the result of a linear combination of the standardized values in each classified dataset and its weight. Seven weight sets were formed from either univariate or multivariate PLS analyses. Accuracy tests were done by testing the classification of measured pH values of 74 soil profiles with different vulnerability maps and by evaluating the areas that were not AS soil within the groups of medium to high AS soil probability in the land-cover and soil-type datasets. In comparison to the other weight sets, the weight sets from the multivariate PLS analysis of the matrix of total Ni2+ & S or total Cu2+ & S had the most robust predictive performance.
Sensitivity analysis was done for the weight set of total Ni2+ & S, and the results showed that the availability of ditches, the change in the terrain surfaces, the altitude level, and the slope had a high influence on the vulnerability map of AS soils. The study showed that multivariate analysis was a very good methodology for predicting the probability of acid sulfate soil.
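
The weighting step described in this abstract, where VIP values are standardized into weights and the map is a linear combination of dataset values and weights, can be illustrated with a small sketch. It assumes that "standardized" means dividing each VIP value by the sum of all VIP values, which the abstract does not state. The function names, layer names and numbers are hypothetical.

```python
def vip_weights(vip):
    """Turn VIP values into weights that sum to 1 (assumed standardisation)."""
    total = sum(vip.values())
    if total <= 0:
        raise ValueError("VIP values must have a positive sum")
    return {name: value / total for name, value in vip.items()}


def vulnerability_score(ranks, weights):
    """Weighted linear combination of one map cell's ranks across datasets."""
    return sum(weights[name] * ranks[name] for name in weights)


# Hypothetical VIP values for three classified datasets, and one map cell.
weights = vip_weights({"soil_type": 2.0, "land_cover": 1.0, "slope": 1.0})
cell_score = vulnerability_score({"soil_type": 4, "land_cover": 2, "slope": 0}, weights)
```

Applying vulnerability_score to every cell of the classified rasters would give one vulnerability map per weight set, so the seven weight sets in the abstract would correspond to seven such maps.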

9

Sinioja, Tim. "Source characterization of soils contaminated with Polycyclic Aromatic Compounds (PACs) by use of Partial Least Squares Discriminant Analysis (PLS-DA)." Thesis, Örebro universitet, Institutionen för naturvetenskap och teknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-64627.

Abstract:

Polycyclic aromatic compounds (PACs) are organic compounds that include several sub-groups of toxic, persistent and carcinogenic environmental pollutants consisting of two or more non-substituted or substituted aromatic rings. Due to the complexity of PAC-mixtures found in the environment it can be challenging and time-consuming to track the sources of contamination. In the present study, multivariate data analysis (MVDA) models, such as principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were applied to track sources of PACs at contaminated sites. Based on the chemical profile of 78 PACs obtained in GC-MS analysis of soils, 26 observations were classified according to their petrogenic, pyrogenic or urban background soil origin. Two soil samples of unknown origin collected at a contaminated site in Mjölby, Sweden, were successfully fitted to the validated PLS-DA model and their origins were determined as petrogenic. The study shows that validated PLS-DA models can be applied to predict the petrogenic, pyrogenic and urban background soil origins of samples collected at PAC contaminated sites, thus to track the sources of contamination. It is also concluded that 16 U.S. Environmental Protection Agency’s (EPA) priority polycyclic aromatic hydrocarbons (PAHs) are not sufficient to predict the origin of contamination with PCA or PLS-DA.

10

Hassling, Andreas, and Simon Flink. "SYSTEM IDENTIFICATION OF A WASTE-FIRED CFB BOILER : Using Principal Component Analysis (PCA) and Partial Least Squares Regression modeling (PLS-R)." Thesis, Mälardalens högskola, Akademin för ekonomi, samhälle och teknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-34979.

Abstract:

Heat and electricity production along with waste management are two modern day challenges for society. One of the possible solutions to both of them is the incineration of household waste to produce heat and electricity. Incineration is a waste-to-energy treatment process, which can reduce the need for landfills and save the use of more valuable fuels, thereby conserving natural resources. This report investigates the performance and emissions of a municipal solid waste (MSW) fueled industrial boiler by performing a system identification analysis using Principal Component Analysis (PCA) and Partial Least Squares Regression (PLS-R) modeling. The boiler is located in Västerås, Sweden and has a maximum capacity of 167 MW. It produces heat and electricity for the city of Västerås and is operated by Mälarenergi AB. A dataset containing 148 different boiler variables, measured at one-hour intervals over 2 years, was used for the system identification analysis. The dataset was visually inspected to remove obvious outliers before beginning the analysis using a multivariate data analysis software package called The Unscrambler X (Version 10.3, CAMO Software, Norway). Correlations found using PCA were taken into account during the PLS-R modelling, where models were created for one response each. Some variables had an unexpected impact on the models while others were fully consistent with combustion theory. Results found during the system analysis process are regarded as reliable. Any errors may be due to outlier data points and model inadequacies.

11

Bringmann, Philipp. "Adaptive least-squares finite element method with optimal convergence rates." Doctoral thesis, Humboldt-Universität zu Berlin, 2021. http://dx.doi.org/10.18452/22350.

Abstract:

Least-squares finite element methods (LSFEMs) are based on the minimisation of the least-squares functional, which consists of the squared norms of the residuals of first-order systems of partial differential equations. This functional provides a reliable and efficient built-in a posteriori error estimator and allows for adaptive mesh-refinement. The established convergence analysis with rates for adaptive algorithms, as summarised in the axiomatic framework by Carstensen, Feischl, Page, and Praetorius (Comp. Math. Appl., 67(6), 2014), fails for two reasons. First, the least-squares estimator lacks prefactors in terms of the mesh-size, which seemingly prevents a reduction under mesh-refinement. Second, the first-order divergence LSFEMs measure the flux or stress errors in the H(div) norm and thus involve a data resolution error of the right-hand side f. These difficulties led to a twofold paradigm shift in the convergence analysis with rates for adaptive LSFEMs in Carstensen and Park (SIAM J. Numer. Anal., 53(1), 2015) for the lowest-order discretisation of the 2D Poisson model problem with homogeneous Dirichlet boundary conditions. Accordingly, a novel explicit residual-based a posteriori error estimator accomplishes the reduction property. Furthermore, a separate marking strategy in the adaptive algorithm ensures sufficient data resolution. This thesis presents the generalisation of these techniques to three linear model problems, namely, the Poisson problem, the Stokes equations, and the linear elasticity problem. It verifies the axioms of adaptivity with separate marking by Carstensen and Rabus (SIAM J. Numer. Anal., 55(6), 2017) in three spatial dimensions. The analysis covers discretisations with arbitrary polynomial degree and inhomogeneous Dirichlet and Neumann boundary conditions. Numerical experiments confirm the theoretically proven optimal convergence rates of the h-adaptive algorithm.

12

Le Floch, Edith. "Méthodes multivariées pour l'analyse jointe de données de neuroimagerie et de génétique." Thesis, Paris 11, 2012. http://www.theses.fr/2012PA112214/document.

Abstract:

Brain imaging is increasingly recognised as an interesting intermediate phenotype for understanding the complex path between genetics and behavioural or clinical phenotypes. In this context, a first goal is to propose methods to identify the part of genetic variability that explains some neuroimaging variability. Classical univariate approaches often ignore the potential joint effects that may exist between genes or the potential covariations between brain regions. Our first contribution is to improve the sensitivity of the univariate approach by taking advantage of the multivariate nature of the genetic data in a local way. Indeed, we adapt cluster-inference techniques from neuroimaging to Single Nucleotide Polymorphism (SNP) data, by looking for 1D clusters of adjacent SNPs associated with the same imaging phenotype. Then, we extend the concept of clusters and combine voxel clusters and SNP clusters, by using a simple 4D cluster test that jointly detects brain and genome regions with high associations. We obtain promising preliminary results on both simulated and real datasets. Our second contribution is to investigate exploratory multivariate methods to increase the detection power of imaging genetics studies, by accounting for the potential multivariate nature of the associations, at a longer range, on both the imaging and the genetics sides. Recently, Partial Least Squares (PLS) regression and Canonical Correlation Analysis (CCA) have been proposed to analyse genetic and transcriptomic data. Here, we propose to transpose this idea to the genetics vs. imaging context. Moreover, we investigate the use of different strategies of regularisation and dimension reduction techniques combined with PLS or CCA, to face the overfitting issues due to the very high dimensionality of the data. We propose a comparison study of the different strategies on both a simulated dataset and a real fMRI and SNP dataset. Univariate selection appears to be necessary to reduce the dimensionality. However, the generalisable and significant association uncovered on the real dataset by the two-step approach combining univariate filtering and L1-regularised PLS suggests that discovering meaningful imaging genetics associations calls for a multivariate approach.

13

Johnson, Mikael. "Acoustic Emission in Composite Laminates - Numerical Simulations and Experimental Characterization." Doctoral thesis, KTH, Solid Mechanics, 2002. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-3452.

14

Venaik, Sunil AGSM UNSW. "A Model of Global Marketing in Multinational Firms: An Empirical Investigation." Awarded by: University of New South Wales. AGSM, 1999. http://handle.unsw.edu.au/1959.4/17479.

Abstract:

With increasing globalisation of the world economy, there is growing interest in international business research among academics, business practitioners and public policy makers. As marketing is usually the first corporate function to internationalise, it occupies the centre-stage in the international strategy debate. The objective of this study is to understand the environmental and organisational factors that drive the desirable outcomes of learning, innovation and performance in multinational firms. By adapting the IO-based, resource-based and contingency theories, the study proposes the environment-conduct-outcome framework and a model of global marketing in MNCs. Using the structural equation modelling-based PLS methodology, the model is estimated with data from a global survey of marketing managers in MNC subsidiaries. The results show that the traditional international marketing strategy and organisational structure constructs of adaptation and autonomy do not have a significant direct effect on MNC performance. Instead, the effects are largely mediated by the networking, learning and innovation constructs that are included in the proposed model. The study also shows that, whereas collaborative decision making has a positive effect on interunit learning, subsidiary autonomy has a significant influence on innovativeness in MNC subsidiaries. Finally, it is found that marketing mix adaptation has an adverse impact on the performance of MNCs facing high global integration pressures but improves the performance of MNCs confronted with low global integration pressures. The findings have important implications for global marketing in MNCs. First, to enhance organisational learning and innovation and ultimately improve corporate performance, MNCs should simultaneously develop the potentially conflicting organisational attributes of collective decision-making among the subsidiaries and greater autonomy to the subsidiaries. 
Second, to tap local knowledge, MNCs should increasingly regard their country units as 'colleges' or 'seminaries' of learning rather than merely as 'subsidiaries' with secondary or subordinate roles. Finally, to improve MNC performance, the key requirement is to achieve a good fit between the global organisational structure, marketing strategy and business environment. Overall, the results provide partial support for the IO-based and resource-based views and strong support for the contingency perspective in international strategy.


15

Loftus, John. "On the development of control systems technology for fermentation processes." Thesis, University of Manchester, 2017. https://www.research.manchester.ac.uk/portal/en/theses/on-the-development-of-control-systems-technology-for-fermentation-processes(61955790-a48b-4703-8942-bfe47a38a6c2).html.

Full text

Abstract:

Fermentation processes play an integral role in the manufacture of pharmaceutical products. The Quality by Design initiative, combined with Process Analytical Technologies, aims to facilitate the consistent production of high quality products in the most efficient and economical way. The ability to estimate and control product quality from these processes is essential in achieving this aim. Large historical datasets are commonplace in the pharmaceutical industry and multivariate methods based on PCA and PLS have been successfully used in a wide range of applications to extract useful information from such datasets. This thesis has focused on the development and application of novel multivariate methods to the estimation and control of product quality from a number of processes. The document is divided into four main categories. Firstly, the related literature and inherent mathematical techniques are summarised. Following this, the three main technical areas of work are presented. The first of these relates to the development of a novel method for estimating the quality of products from a proprietary process using PCA. The ability to estimate product quality is useful for identifying production steps that are potentially problematic and also increases process efficiency by ensuring that any defective products are detected before they undergo any further processing. The proposed method is simple and robust and has been applied to two separate case studies, the results of which demonstrate the efficacy of the technique. The second area of work concentrates on the development of a novel method of identifying the operational phases of batch fermentation processes and is based on PCA and associated statistics. Knowledge of the operational phases of a process can be beneficial from a monitoring and control perspective and allows a process to be divided into phases that can be approximated by a linear model. 
The devised methodology is applied to two separate fermentation processes and results show the capability of the proposed method. The third area of work focuses on undertaking a performance evaluation of two multivariate algorithms, PLS and EPLS, in controlling the end-point product yield of fermentation processes. Control of end-point product quality is of crucial importance in many manufacturing industries, such as the pharmaceutical industry. Developing a controller based on historical and identification process data is attractive due to the simplicity of modelling and the increasing availability of process data. The methodology is applied to two case studies and performance evaluated. From both a prediction and control perspective, it is seen that EPLS outperforms PLS, which is important if modelling data is limited.


16

Yoon, Jisu [author], Tatyana [academic supervisor] Krivobokova, Stephan [academic supervisor] Klasen, and Axel [academic supervisor] Dreher. "Partial Least Squares and Principal Component Analysis with Non-metric Variables for Composite Indices / Jisu Yoon. Reviewers: Tatyana Krivobokova ; Stephan Klasen ; Axel Dreher. Supervisor: Tatyana Krivobokova." Göttingen : Niedersächsische Staats- und Universitätsbibliothek Göttingen, 2015. http://d-nb.info/1076160972/34.

Full text


17

Abrahamsson, Sandra. "Utformning av mjukvarusensorer för avloppsvatten med multivariata analysmetoder." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-207863.

Full text

Abstract:

Varje studie av en verklig process eller ett verkligt system är baserat på mätdata. Förr var den tillgängliga datamängden vid undersökningar ytterst begränsad, men med dagens teknik är mätdata betydligt mer lättillgängligt. Från att tidigare enbart haft få och ofta osammanhängande mätningar för någon enstaka variabel, till att ha många och så gott som kontinuerliga mätningar på ett större antal variabler. Detta förändrar möjligheterna att förstå och beskriva processer avsevärt. Multivariat analys används ofta när stora datamängder med många variabler utvärderas. I det här projektet har de multivariata analysmetoderna PCA (principalkomponentanalys) och PLS (partial least squares projection to latent structures) använts på data över avloppsvatten insamlat på Hammarby Sjöstadsverk. På reningsverken ställs idag allt hårdare krav från samhället för att de ska minska sin miljöpåverkan. Med bland annat bättre processkunskaper kan systemen övervakas och styras så att resursförbrukningen minskas utan att försämra reningsgraden. Vissa variabler är lätta att mäta direkt i vattnet medan andra kräver mer omfattande laboratorieanalyser. Några parametrar i den senare kategorin som är viktiga för reningsgraden är avloppsvattnets innehåll av fosfor och kväve, vilka bland annat kräver resurser i form av kemikalier till fosforfällning och energi till luftning av det biologiska reningssteget. Halterna av dessa ämnen i inkommande vatten varierar under dygnet och är svåra att övervaka. Syftet med den här studien var att undersöka om det är möjligt att utifrån lättmätbara variabler erhålla information om de mer svårmätbara variablerna i avloppsvattnet genom att utnyttja multivariata analysmetoder för att skapa modeller över variablerna. Modellerna kallas ofta för mjukvarusensorer (soft sensors) eftersom de inte utgörs av fysiska sensorer. Mätningar på avloppsvattnet i Linje 1 gjordes under tidsperioden 11 – 15 mars 2013 på flera ställen i processen. 
Därefter skapades flera multivariata modeller för att försöka förklara de svårmätbara variablerna. Resultatet visar att det går att erhålla information om variablerna med PLS-modeller som bygger på mer lättillgänglig data. De framtagna modellerna fungerade bäst för att förklara inkommande kväve, men för att verkligen säkerställa modellernas riktighet bör ytterligare validering ske.
Studies of real processes are based on measured data. In the past, the amount of available data was very limited. With modern technology, however, measurement data are far more readily available, which considerably changes the possibilities for understanding and describing processes. Multivariate analysis is often used when large datasets containing many variables are evaluated. In this thesis, the multivariate analysis methods PCA (principal component analysis) and PLS (partial least squares projection to latent structures) were applied to wastewater data collected at the Hammarby Sjöstadsverk WWTP (wastewater treatment plant). Wastewater treatment plants are required to monitor and control their systems in order to reduce their environmental impact. With improved knowledge of the processes involved, the impact can be significantly decreased without affecting the plant efficiency. Several variables are easy to measure directly in the water, while others require extensive laboratory analysis. Some of the parameters in the latter category are the phosphorus and nitrogen contents of the water, both of which are important for the wastewater treatment results. The concentrations of these substances in the inlet water vary during the day and are difficult to monitor properly. The purpose of this study was to investigate whether it is possible, from the more easily measured variables, to obtain information on those which require more extensive analysis. This was done by using multivariate analysis to create models attempting to explain the variation in these variables. The models are commonly referred to as soft sensors, since they do not make use of any physical sensors to measure the relevant variable. Data were collected from the wastewater at different stages of the treatment process during the period of March 11 to March 15, 2013, and a number of multivariate models were created. 
The results show that it is possible to obtain information about the variables with PLS models based on easy-to-measure variables. The best model obtained was the one explaining the concentration of nitrogen in the inlet water.


18

Vitale, Raffaele. "Novel chemometric proposals for advanced multivariate data analysis, processing and interpretation." Doctoral thesis, Universitat Politècnica de València, 2017. http://hdl.handle.net/10251/90442.

Full text

Abstract:

The present Ph.D. thesis, primarily conceived to support and reinforce the relation between the academic and industrial worlds, was developed in collaboration with Shell Global Solutions (Amsterdam, The Netherlands) in the endeavour of applying and possibly extending well-established latent variable-based approaches (i.e. Principal Component Analysis, PCA; Partial Least Squares regression, PLS; or Partial Least Squares Discriminant Analysis, PLSDA) for complex problem solving not only in the fields of manufacturing troubleshooting and optimisation, but also in the wider environment of multivariate data analysis. To this end, novel efficient algorithmic solutions are proposed throughout all chapters to address very disparate tasks, from calibration transfer in spectroscopy to real-time modelling of streaming flows of data. The manuscript is divided into the following six parts, focused on various topics of interest: Part I - Preface, where an overview of this research work, its main aims and justification is given together with a brief introduction on PCA, PLS and PLSDA; Part II - On kernel-based extensions of PCA, PLS and PLSDA, where the potential of kernel techniques, possibly coupled to specific variants of the recently rediscovered pseudo-sample projection, formulated by the English statistician John C. Gower, is explored and their performance compared to that of more classical methodologies in four different application scenarios: segmentation of Red-Green-Blue (RGB) images, discrimination of on-/off-specification batch runs, monitoring of batch processes and analysis of mixture designs of experiments; Part III - On the selection of the number of factors in PCA by permutation testing, where an extensive guideline on how to accomplish the selection of PCA components by permutation testing is provided through the comprehensive illustration of an original algorithmic procedure implemented for such a purpose; Part IV - On modelling common and distinctive sources of variability in multi-set data analysis, where several practical aspects of two-block common and distinctive component analysis (carried out by methods like Simultaneous Component Analysis, SCA; DIStinctive and COmmon Simultaneous Component Analysis, DISCO-SCA; Adapted Generalised Singular Value Decomposition, Adapted GSVD; ECO-POWER; Canonical Correlation Analysis, CCA; and 2-block Orthogonal Projections to Latent Structures, O2PLS) are discussed, a new computational strategy for determining the number of common factors underlying two data matrices sharing the same row- or column-dimension is described, and two innovative approaches for calibration transfer between near-infrared spectrometers are presented; Part V - On the on-the-fly processing and modelling of continuous high-dimensional data streams, where a novel software system for rational handling of multi-channel measurements recorded in real time, the On-The-Fly Processing (OTFP) tool, is designed; Part VI - Epilogue, where final conclusions are drawn, future perspectives are delineated, and annexes are included.
La presente tesis doctoral, concebida principalmente para apoyar y reforzar la relación entre la academia y la industria, se desarrolló en colaboración con Shell Global Solutions (Amsterdam, Países Bajos) en el esfuerzo de aplicar y posiblemente extender los enfoques ya consolidados basados en variables latentes (es decir, Análisis de Componentes Principales - PCA - Regresión en Mínimos Cuadrados Parciales - PLS - o PLS discriminante - PLSDA) para la resolución de problemas complejos no sólo en los campos de mejora y optimización de procesos, sino también en el entorno más amplio del análisis de datos multivariados. Con este fin, en todos los capítulos proponemos nuevas soluciones algorítmicas eficientes para abordar tareas dispares, desde la transferencia de calibración en espectroscopia hasta el modelado en tiempo real de flujos de datos. El manuscrito se divide en las seis partes siguientes, centradas en diversos temas de interés: Parte I - Prefacio, donde presentamos un resumen de este trabajo de investigación, damos sus principales objetivos y justificaciones junto con una breve introducción sobre PCA, PLS y PLSDA; Parte II - Sobre las extensiones basadas en kernels de PCA, PLS y PLSDA, donde presentamos el potencial de las técnicas de kernel, eventualmente acopladas a variantes específicas de la recién redescubierta proyección de pseudo-muestras, formulada por el estadista inglés John C. 
Gower, y comparamos su rendimiento respecto a metodologías más clásicas en cuatro aplicaciones a escenarios diferentes: segmentación de imágenes Rojo-Verde-Azul (RGB), discriminación y monitorización de procesos por lotes y análisis de diseños de experimentos de mezclas; Parte III - Sobre la selección del número de factores en el PCA por pruebas de permutación, donde aportamos una guía extensa sobre cómo conseguir la selección de componentes de PCA mediante pruebas de permutación y una ilustración completa de un procedimiento algorítmico original implementado para tal fin; Parte IV - Sobre la modelización de fuentes de variabilidad común y distintiva en el análisis de datos multi-conjunto, donde discutimos varios aspectos prácticos del análisis de componentes comunes y distintivos de dos bloques de datos (realizado por métodos como el Análisis Simultáneo de Componentes - SCA - Análisis Simultáneo de Componentes Distintivos y Comunes - DISCO-SCA - Descomposición Adaptada Generalizada de Valores Singulares - Adapted GSVD - ECO-POWER, Análisis de Correlaciones Canónicas - CCA - y Proyecciones Ortogonales de 2 conjuntos a Estructuras Latentes - O2PLS). Presentamos a su vez una nueva estrategia computacional para determinar el número de factores comunes subyacentes a dos matrices de datos que comparten la misma dimensión de fila o columna y dos planteamientos novedosos para la transferencia de calibración entre espectrómetros de infrarrojo cercano; Parte V - Sobre el procesamiento y la modelización en tiempo real de flujos de datos de alta dimensión, donde diseñamos la herramienta de Procesamiento en Tiempo Real (OTFP), un nuevo sistema de manejo racional de mediciones multi-canal registradas en tiempo real; Parte VI - Epílogo, donde presentamos las conclusiones finales, delimitamos las perspectivas futuras, e incluimos los anexos.
La present tesi doctoral, concebuda principalment per a recolzar i reforçar la relació entre l'acadèmia i la indústria, es va desenvolupar en col·laboració amb Shell Global Solutions (Amsterdam, Països Baixos) amb l'esforç d'aplicar i possiblement estendre els enfocaments ja consolidats basats en variables latents (és a dir, Anàlisi de Components Principals - PCA - Regressió en Mínims Quadrats Parcials - PLS - o PLS discriminant - PLSDA) per a la resolució de problemes complexos no solament en els camps de la millora i optimització de processos, sinó també en l'entorn més ampli de l'anàlisi de dades multivariades. A aquest efecte, en tots els capítols proposem noves solucions algorítmiques eficients per a abordar tasques dispars, des de la transferència de calibratge en espectroscopia fins al modelatge en temps real de fluxos de dades. El manuscrit es divideix en les sis parts següents, centrades en diversos temes d'interès: Part I - Prefaci, on presentem un resum d'aquest treball de recerca, es donen els seus principals objectius i justificacions juntament amb una breu introducció sobre PCA, PLS i PLSDA; Part II - Sobre les extensions basades en kernels de PCA, PLS i PLSDA, on presentem el potencial de les tècniques de kernel, eventualment acoblades a variants específiques de la recentment redescoberta projecció de pseudo-mostres, formulada per l'estadista anglés John C. 
Gower, i comparem el seu rendiment respecte a metodologies més clàssiques en quatre aplicacions a escenaris diferents: segmentació d'imatges Roig-Verd-Blau (RGB), discriminació i monitorització de processos per lots i anàlisi de dissenys d'experiments de mescles; Part III - Sobre la selecció del nombre de factors en el PCA per proves de permutació, on aportem una guia extensa sobre com aconseguir la selecció de components de PCA a través de proves de permutació i una il·lustració completa d'un procediment algorítmic original implementat per a la finalitat esmentada; Part IV - Sobre la modelització de fonts de variabilitat comuna i distintiva en l'anàlisi de dades multi-conjunt, on discutim diversos aspectes pràctics de l'anàlisis de components comuns i distintius de dos blocs de dades (realitzat per mètodes com l'Anàlisi Simultània de Components - SCA - Anàlisi Simultània de Components Distintius i Comuns - DISCO-SCA - Descomposició Adaptada Generalitzada en Valors Singulars - Adapted GSVD - ECO-POWER, Anàlisi de Correlacions Canòniques - CCA - i Projeccions Ortogonals de 2 blocs a Estructures Latents - O2PLS). Presentem al mateix temps una nova estratègia computacional per a determinar el nombre de factors comuns subjacents a dues matrius de dades que comparteixen la mateixa dimensió de fila o columna, i dos plantejaments nous per a la transferència de calibratge entre espectròmetres d'infraroig proper; Part V - Sobre el processament i la modelització en temps real de fluxos de dades d'alta dimensió, on dissenyem l'eina de Processament en Temps Real (OTFP), un nou sistema de tractament racional de mesures multi-canal registrades en temps real; Part VI - Epíleg, on presentem les conclusions finals, delimitem les perspectives futures, i incloem annexos.
Vitale, R. (2017). Novel chemometric proposals for advanced multivariate data analysis, processing and interpretation [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/90442


19

Hennerdal, Aron. "Investigation of multivariate prediction methods for the analysis of biomarker data." Thesis, Linköping University, The Department of Physics, Chemistry and Biology, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-5889.

Full text

Abstract:

The paper describes predictive modelling of biomarker data stemming from patients suffering from multiple sclerosis. Improvements of multivariate analyses of the data are investigated with the goal of increasing the capability to assign samples to correct subgroups from the data alone.

The effects of different preceding scalings of the data are investigated and combinations of multivariate modelling methods and variable selection methods are evaluated. Attempts at merging the predictive capabilities of the method combinations through voting-procedures are made. A technique for improving the result of PLS-modelling, called bagging, is evaluated.

Of the multivariate analysis methods tried, the best are found to be partial least squares (PLS) and support vector machines (SVM). It is concluded that scaling has little effect on the prediction performance of most methods. The method combinations have interesting properties: the default variable selections of the multivariate methods are not always the best. Bagging improves performance, but at a high cost. No reasons for drastically changing the workflows of the biomarker data analysis are found, but slight improvements are possible. Further research is needed.


20

Durif, Ghislain. "Multivariate analysis of high-throughput sequencing data." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSE1334/document.

Full text

Abstract:

L'analyse statistique de données de séquençage à haut débit (NGS) pose des questions computationnelles concernant la modélisation et l'inférence, en particulier à cause de la grande dimension des données. Le travail de recherche dans ce manuscrit porte sur des méthodes de réductions de dimension hybrides, basées sur des approches de compression (représentation dans un espace de faible dimension) et de sélection de variables. Des développements sont menés concernant la régression "Partial Least Squares" parcimonieuse (supervisée) et les méthodes de factorisation parcimonieuse de matrices (non supervisée). Dans les deux cas, notre objectif sera la reconstruction et la visualisation des données. Nous présenterons une nouvelle approche de type PLS parcimonieuse, basée sur une pénalité adaptative, pour la régression logistique. Cette approche sera utilisée pour des problèmes de prédiction (devenir de patients ou type cellulaire) à partir de l'expression des gènes. La principale problématique sera de prendre en compte la réponse pour écarter les variables non pertinentes. Nous mettrons en avant le lien entre la construction des algorithmes et la fiabilité des résultats. Dans une seconde partie, motivés par des questions relatives à l'analyse de données "single-cell", nous proposons une approche probabiliste pour la factorisation de matrices de comptage, laquelle prend en compte la sur-dispersion et l'amplification des zéros (caractéristiques des données single-cell). Nous développerons une procédure d'estimation basée sur l'inférence variationnelle. Nous introduirons également une procédure de sélection de variables probabiliste basée sur un modèle "spike-and-slab". L'intérêt de notre méthode pour la reconstruction, la visualisation et le clustering de données sera illustré par des simulations et par des résultats préliminaires concernant une analyse de données "single-cell". Toutes les méthodes proposées sont implémentées dans deux packages R : plsgenomics et CMF.
The statistical analysis of Next-Generation Sequencing data raises many computational challenges regarding modeling and inference, especially because of the high dimensionality of genomic data. The research work in this manuscript concerns hybrid dimension reduction methods that rely on both compression (representation of the data in a lower-dimensional space) and variable selection. Developments are made concerning the sparse Partial Least Squares (PLS) regression framework for supervised classification, and the sparse matrix factorization framework for unsupervised exploration. In both situations, our main purpose will be to focus on the reconstruction and visualization of the data. First, we will present a new sparse PLS approach, based on an adaptive sparsity-inducing penalty, that is suitable for logistic regression to predict the label of a discrete outcome. For instance, such a method will be used for prediction (fate of patients or specific type of unidentified single cells) based on gene expression profiles. The main issue in such a framework is to account for the response in order to discard irrelevant variables. We will highlight the direct link between the derivation of the algorithms and the reliability of the results. Then, motivated by questions regarding single-cell data analysis, we propose a flexible model-based approach for the factorization of count matrices that accounts for over-dispersion as well as zero-inflation (both characteristic of single-cell data), for which we derive an estimation procedure based on variational inference. In this scheme, we consider probabilistic variable selection based on a spike-and-slab model suitable for count data. The interest of our procedure for data reconstruction, visualization and clustering will be illustrated by simulation experiments and by preliminary results on single-cell data analysis. All proposed methods were implemented in two R packages, "plsgenomics" and "CMF", based on high performance computing.


21

Lopez, Montero Eduardo. "Use of multivariate statistical methods for control of chemical batch processes." Thesis, University of Manchester, 2016. https://www.research.manchester.ac.uk/portal/en/theses/use-of-multivariate-statistical-methods-for-control-of-chemical-batch-processes(6cf45624-2388-4e85-b4c6-99503547ad06).html.

Full text

Abstract:

In order to meet tight product quality specifications for chemical batch processes, it is vital to monitor and control product quality throughout the batch duration. However, the frequent lack of in situ sensors for continuous monitoring of batch product quality complicates the control problem and calls for novel control approaches. This thesis focuses on the study and application of multivariate statistical methods to control product quality in chemical batch processes. These multivariate statistical methods can be used to identify data-driven prediction models that can be integrated within a model predictive control (MPC) framework. The ideal MPC control strategy achieves end-product quality specifications by performing trajectory tracking during the batch operating time. However, due to the lack of in-situ sensors, measurements of product quality are usually obtained by laboratory assays and are, therefore, inherently intermittent. This thesis proposes a new approach to realise trajectory tracking control of batch product quality in those situations where only intermittent measurements are available. The scope of this methodology consists of: 1) the identification of a partial least squares (PLS) model that works as an estimator of product quality, 2) the transformation of the PLS model into a recursive formulation utilising a moving window technique, and 3) the incorporation of the recursive PLS model as a predictor into a standard MPC framework for tracking the desired trajectory of batch product quality. The structure of the recursive PLS model allows a straightforward incorporation of process constraints in the optimisation process. Additionally, a method to incorporate a nonlinear inner relation within the proposed PLS recursive model is introduced. This nonlinear inner relation is a combination of feedforward artificial neural networks (ANNs) and linear regression. 
Nonlinear models based on this method can predict product quality of highly nonlinear batch processes and can, therefore, be used within an MPC framework to control such processes. The use of linear regression in addition to ANNs within the PLS model reduces the risk of overfitting and also reduces the computational effort of the optimisation carried out by the controller. The benefits of the proposed modelling and control methods are demonstrated using a number of simulated batch processes.
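
As a rough illustration of the moving-window idea described in this abstract, the sketch below refits a one-component PLS1 model on the most recent laboratory measurements each time a new one arrives. It is a simplified stand-in, not the recursive formulation developed in the thesis: the class name MovingWindowPLS, the function names, the window length and the restriction to a single latent variable are all assumptions made for the example, and it uses only the Python standard library.

```python
from collections import deque
from math import sqrt


def fit_pls1_one_component(X, y):
    """Fit a one-component PLS1 model on mean-centred data (illustrative only)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: the direction in X-space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sqrt(sum(v * v for v in w))
    if norm == 0.0:
        # y does not co-vary with X in this window: fall back to the mean.
        return x_mean, y_mean, [0.0] * p
    w = [v / norm for v in w]
    # Scores t = Xc w, then the inner regression coefficient q of y on t.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    coef = [q * wj for wj in w]
    return x_mean, y_mean, coef


def predict(model, x):
    """Predict quality for one process measurement vector x."""
    x_mean, y_mean, coef = model
    return y_mean + sum(c * (xj - mj) for c, xj, mj in zip(coef, x, x_mean))


class MovingWindowPLS:
    """Hypothetical estimator: keeps the last `window` samples and refits."""

    def __init__(self, window):
        self.X = deque(maxlen=window)  # oldest samples drop out automatically
        self.y = deque(maxlen=window)
        self.model = None

    def update(self, x, y_lab):
        """Add one intermittent lab measurement and refit on the window."""
        self.X.append(list(x))
        self.y.append(float(y_lab))
        if len(self.y) >= 2:
            self.model = fit_pls1_one_component(list(self.X), list(self.y))

    def predict(self, x):
        if self.model is None:
            raise ValueError("need at least two lab measurements")
        return predict(self.model, x)
```

In an MPC setting such an estimator would be called once per control step with the current process measurements, and updated only when a new assay result becomes available. The two perfectly collinear inputs used in the accompanying check show that the latent-variable fit does not break down where ordinary least squares would be singular.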


22

Aloglu, Ahmet Kemal. "Characterization of Foods by Chromatographic and Spectroscopic Methods Coupled to Chemometrics." Ohio University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou152293360889416.

Full text


23

Wang, Hailun. "Some Conclusions of Statistical Analysis of the Spectropscopic Evaluation of Cervical Cancer." Digital Archive @ GSU, 2008. http://digitalarchive.gsu.edu/math_theses/58.

Full text

Abstract:

To significantly improve the early detection of cervical precancers and cancers, LightTouch™ is under development by SpectRx Inc. LightTouch™ identifies cancers and precancers quickly by using a spectrometer to analyze light reflected from the cervix. Data from the spectrometer are then used to create an image of the cervix that highlights the location and severity of disease. Our research aims to find appropriate models that can be used to generate a map-like image distinguishing diseased tissue from normal tissue and to further diagnose cancerous conditions of the cervix. After an extensive search for, and reduction of, explanatory variables, logistic regression and partial least squares regression were successfully applied in our modeling process. These models were validated by 60/40 cross-validation and 10-fold cross-validation. Model performance was examined further in terms of AUC, sensitivity and specificity, and threshold.
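
The validation quantities named in this abstract can be sketched in a few lines of standard-library Python. This is a generic illustration under assumed names (k_fold_indices, sensitivity_specificity, auc) and toy data, not the procedure or the data used in the thesis. Labels are taken to be 1 for diseased and 0 for normal, and scores are model outputs where higher means more likely diseased.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Return k (train, test) index pairs that partition range(n)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, near-equal folds
    return [(sorted(set(idx) - set(f)), sorted(f)) for f in folds]


def sensitivity_specificity(y_true, scores, threshold):
    """Sensitivity and specificity when score >= threshold means 'diseased'.

    Assumes y_true contains at least one positive and one negative label.
    """
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve as the fraction of correctly ordered
    (positive, negative) pairs, counting ties as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

For 10-fold cross-validation, a model would be fitted on each train index set and scored on the matching test set; the threshold passed to sensitivity_specificity is the operating point whose choice the abstract mentions examining.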


24

Plard, Jérôme. "Apport de la chimiométrie et des plans d’expériences pour l’évaluation de la qualité de l’huile d’olive au cours de différents processus de vieillissement." Thesis, Aix-Marseille, 2014. http://www.theses.fr/2014AIXM4315/document.

Full text

Abstract:

L'huile d'olive est un élément important de l'alimentation méditerranéenne. Cependant lorsqu'une huile vieillit, elle se dégrade et perd ses propriétés. Il est donc important de connaitre l'évolution de la composition de l'huile en fonction de ses conditions de stockage et de fabrication. Ce suivi a été effectué sur deux huiles de fabrication différente, une huile fruité vert et une huile fruité noir, obtenue à partir d'olive à maturité que l'on a laissé fermenter quelques jours. De manière à obtenir rapidement des vieillissements poussés, ces deux huiles ont été vieillies artificiellement, par procédé thermique, et par procédé photochimique. Ces vieillissements ont été réalisés sur des volumes différents de manière à déterminer l'impact du rapport surface/masse. En parallèle, des échantillons de chacune des deux huiles ont été conservés durant 24 mois dans des conditions de stockage différentes déterminées à l'aide d'un plan d'expériences. Les paramètres influençant le plus la conservation de l'huile d'olive sont l'apport en oxygène, la luminosité et la température. Ces influences ont été déterminées à partir du suivi des principaux paramètres de qualité. La réponse des plans a permis de mettre en évidence des interactions entre ces différents paramètres. L'analyse de la composition de l'huile ainsi que de tous les critères de qualité demande beaucoup de temps et consomme une grande quantité de solvant. Afin de pallier à ces désagréments, les résultats ont également été utilisés pour construire des modèles chimiométriques permettant de déterminer ces grandeurs à partir des spectres proche et moyen infrarouge des échantillons.
Olive oil is an important component of the Mediterranean diet. As oil ages, it deteriorates and loses its properties. It is therefore important to know how the composition of the oil evolves according to its storage and manufacturing conditions. This monitoring was carried out on two oils produced in different ways: a green fruity oil obtained from olives harvested before maturity, and a black fruity oil obtained from olives harvested at maturity and fermented for a few days under controlled conditions. To reach advanced stages of aging quickly, these two oils were artificially aged by a thermal process (heated to 180 °C under a supply of O2) and by a photochemical process (under a UV lamp and a supply of O2). These aging treatments were performed on different volumes to determine the impact of the surface/weight ratio. In parallel, samples of both oils were stored for 24 months under different storage conditions determined using an experimental design. The parameters that most affect the conservation of olive oil are oxygen, light and temperature. Their influence was determined by monitoring the key quality criteria. The response of the experimental design helped to highlight the interactions between these parameters. Analysing the oil composition and all the quality criteria requires a large amount of solvent and a lot of time. To overcome these drawbacks, chemometric models were built to determine these criteria from the near- and mid-infrared spectra of the samples. Natural aging is far less advanced than accelerated aging, so separate predictive models were established from the natural aging results and the accelerated aging results.

25

Edberg, Alexandra. "Monitoring Kraft Recovery Boiler Fouling by Multivariate Data Analysis." Thesis, KTH, Skolan för kemi, bioteknologi och hälsa (CBH), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-230906.

Abstract:

This work deals with fouling in the recovery boiler at Montes del Plata, Uruguay. Multivariate data analysis was used to analyze the large amount of available data in order to investigate how different parameters affect the fouling problems. Principal Component Analysis (PCA) and Partial Least Square Projection (PLS) were used in this work. PCA was used to compare average values between time periods with high and low fouling problems, while PLS was used to study the correlation structures between the variables and consequently give an indication of which parameters might be changed to improve the availability of the boiler. The results show that this recovery boiler tends to have problems with fouling that might depend on the distribution of air, the black liquor pressure or the dry solids content of the black liquor. The results also show that multivariate data analysis is a powerful tool for analyzing these types of fouling problems.
Detta arbete handlar om inkruster i sodapannan på Montes del Plata, Uruguay. Multivariat dataanalys har använts för att analysera den stora datamängd som fanns tillgänglig för att undersöka hur olika parametrar påverkar inkrusterproblemen. Principal Component Analysis (PCA) och Partial Least Square Projection (PLS) har i detta jobb använts. PCA har använts för att jämföra medelvärden mellan tidsperioder med höga och låga inkrusterproblem medan PLS har använts för att studera korrelationen mellan variablerna och därmed ge en indikation på vilka parametrar som kan tänkas att ändras för att förbättra tillgängligheten på sodapannan. Resultaten visar att sodapannan tenderar att ha problem med inkruster som kan bero på fördelningen av luft, på svartlutens tryck eller på torrhalten i svartluten. Resultaten visar också att multivariat dataanalys är ett användbart verktyg för att analysera dessa typer av inkrusterproblem.

26

Yan, Lipeng. "The application of multivariate statistical analysis and optimization to batch processes." Thesis, University of Manchester, 2015. https://www.research.manchester.ac.uk/portal/en/theses/the-application-of-multivariate-statistical-analysis-and-optimization-to-batch-processes(e6dbe45d-94bb-4e84-a12f-542876af54f5).html.

Abstract:

Multivariate statistical process control (MSPC) techniques play an important role in industrial batch process monitoring and control. This research illustrates the capabilities and limitations of existing MSPC technologies, with a particular focus on partial least squares (PLS). In modern industry, batch processes often operate over relatively large spaces, with many chemical and physical systems displaying nonlinear performance. However, the linear PLS model cannot predict nonlinear systems, and hence nonlinear extensions to PLS may be required. Nonlinear PLS models can be divided into Type I and Type II. In the Type I nonlinear PLS method, the observed variables are appended with nonlinear transformations. In contrast, the Type II nonlinear PLS method assumes a nonlinear relationship within the latent variable structure of the model. Type I and Type II nonlinear multi-way PLS (MPLS) models were applied to predict the endpoint value of the product in a benchmark simulation of a penicillin batch fermentation process. By analysing and comparing linear MPLS with Type I and Type II nonlinear MPLS models, the advantages and limitations of these methods were identified and summarized. Because of the limitations of the Type I and Type II nonlinear PLS models, Neural Network PLS (NNPLS) was proposed in this study and applied to predict the final product quality in the batch process. The application of the NNPLS method is presented in comparison with the linear PLS method and with the Type I and Type II nonlinear PLS methods. Multi-way NNPLS was found to produce the most accurate results, with the added advantage that no a priori information regarding the order of the dynamics was required. The NNPLS model was also able to identify nonlinear system dynamics in the batch process. Finally, NNPLS was used to build the controller, and the NNPLS method was combined with the endpoint control algorithm. The proposed controller was able to keep the endpoint values of penicillin and biomass concentration at a set-point.
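
As a rough illustration of the Type I idea described in the abstract, the sketch below appends nonlinear transformations to the observed variables; the augmented matrix would then be passed to an ordinary linear PLS routine. The function name `augment_with_squares` and the choice of squared terms are assumptions made for illustration, since the abstract does not state which transformations were used.

```python
def augment_with_squares(X):
    """Append the square of every observed variable to each sample.

    X is a list of samples, each a list of floats. The result keeps the
    original columns first and the squared columns after them, so a linear
    model fitted on the augmented data can capture quadratic effects.
    Squared terms are an illustrative assumption, not the thesis's choice.
    """
    return [list(row) + [value * value for value in row] for row in X]


if __name__ == "__main__":
    samples = [[1.0, 2.0], [3.0, -1.0]]
    for row in augment_with_squares(samples):
        print(row)
```

The appeal of this design is that the regression step stays linear; the cost is that the number of columns grows with every transformation added.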

27

Zitzewitz, Mareike von [Verfasser]. "Konzeptualisierung und Operationalisierung der Unternehmensreputation aus Sicht privater Anleger : Implikationen für Forschung und Praxis auf Basis empirischer Analysen unter Verwendung des Partial-Least-Squares-Ansatzes / Mareike von Zitzewitz." Hannover : Technische Informationsbibliothek (TIB), 2017. http://d-nb.info/1136340912/34.

28

Hassel, Per Anker. "Nonlinear partial least squares." Thesis, University of Newcastle Upon Tyne, 2003. http://hdl.handle.net/10443/465.

Abstract:

Partial Least Squares (PLS) has been shown to be a versatile regression technique with an increasing number of applications in the areas of process control, process monitoring and process analysis. This Thesis considers the area of nonlinear PLS; a nonlinear projection based regression technique. The nonlinearity is introduced as a univariate nonlinear function between projections, or to be more specific, linear combinations of the predictor and the response variables. As for the linear case, the method should handle multicollinearity, underdetermined and noisy systems. Although linear PLS is accepted as an empirical regression method, none of the published nonlinear PLS algorithms have achieved widespread acceptance. This is confirmed from a literature survey where few real applications of the methodology were found. This Thesis investigates two nonlinear PLS methodologies, in particular focusing on their limitations. Based on these studies, two nonlinear PLS algorithms are proposed. In the first of the two existing approaches investigated, the projections are updated by applying an optimization method to reduce the error of the nonlinear inner mapping. This ensures that the error introduced by the nonlinear inner mapping is minimized. However, the procedure is limited as a consequence of problems with the nonlinear optimisation. A new algorithm, Nested PLS (NPLS), is developed to address these issues. In particular, a separate inner PLS is used to update the projections. The NPLS algorithm is shown to outperform existing algorithms for a wide range of regression problems and has the potential to become a more widely accepted nonlinear PLS algorithm than those currently reported in the literature. In the second of the existing approaches, the projections are identified by examining each variable independently, as opposed to minimizing the error of the nonlinear inner mapping directly. 
Although the approach does not necessarily identify the underlying functional relationship, the problems of overfitting and other problems associated with optimization are reduced. Since the underlying functional relationship may not be established accurately, the reliability of the nonlinear inner mapping will be reduced. To address this problem a new algorithm, the Reciprocal Variance PLS (RVPLS), is proposed. Compared with established methodology, RVPLS focuses more on finding the underlying structure, thus reducing the difficulty of finding an appropriate inner mapping. RVPLS is shown to perform well for a number of applications, but does not have the wide-ranging performance of Nested PLS.

29

Hellberg, Sven. "A multivariate approach to QSAR." Doctoral thesis, Umeå universitet, Kemiska institutionen, 1986. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-100713.

Abstract:

Quantitative structure-activity relationships (QSAR) constitute empirical analogy models connecting chemical structure and biological activity. The analogy approach to QSAR assumes that the factors important in the biological system are also contained in chemical model systems. The development of a QSAR can be divided into subproblems: 1. to quantify chemical structure in terms of latent variables expressing analogy, 2. to design test series of compounds, 3. to measure biological activity and 4. to construct a mathematical model connecting chemical structure and biological activity. In this thesis it is proposed that many possibly relevant descriptors should be considered simultaneously in order to efficiently capture the unknown factors inherent in the descriptors. The importance of multivariately and multipositionally varied test series is discussed. Multivariate projection methods such as PCA and PLS are shown to be appropriate for QSAR and to closely correspond to the analogy assumption. The multivariate analogy approach is applied to (a) beta-adrenergic agents, (b) haloalkanes, (c) halogenated ethyl methyl ethers and (d) four different families of peptides.

Diss. (summary) Umeå : Umeå universitet, 1986, with 8 papers

30

Padalkar, Mugdha Vijay. "DEVELOPMENT OF NON-DESTRUCTIVE INFRARED FIBER OPTIC METHOD FOR ASSESSMENT OF LIGAMENT AND TENDON COMPOSITION." Diss., Temple University Libraries, 2016. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/378679.

Abstract:

Bioengineering
Ph.D.
More than 350,000 anterior cruciate ligament (ACL) injuries occur every year in the United States. A torn ACL is typically replaced with an allograft or autograft tendon (patellar, quadriceps or hamstring), with the choice of tissue generally dictated by surgeon preference. Despite the number of ACL reconstructions performed every year, the process of ligamentization, transformation of a tendon graft to a healthy functional ligament, is poorly understood. Previous research studies have relied on mechanical, biochemical and histological studies. However, these methods are destructive. Clinically, magnetic resonance imaging (MRI) is the most common method of graft evaluation, but it lacks adequate resolution and molecular specificity. There is a need for objective methodology to study the ligament repair process that would ideally be non- or minimally invasive. Development of such a method could lead to a better understanding of the effects of therapeutic interventions and rehabilitation protocols in animal models of ligamentization, and ultimately, in clinical studies. Fourier transform infrared (FT-IR) spectroscopy is a technique sensitive to molecular structure and composition in tissues. FT-IR fiber optic probes combined with arthroscopy could prove to be an important tool where minimally invasive tissue assessment is required, such as assessment of graft composition during the ligamentization process. Spectroscopic methods have been used to differentiate normal and diseased connective tissues, but have not been applied to investigate ligamentization, or to investigate differences in tendons and ligaments. In the proposed studies, we hypothesize that infrared spectroscopy can provide molecular information about the compositional differences between tendons and ligaments, which can serve as a foundation to non-destructively monitor the tissue transformation that occurs during ligamentization.
Temple University--Theses

31

Champion, Patrick D. "An analysis of Fourier transform infrared spectroscopy data to predict herpes simplex virus 1 infection." Atlanta, Ga. : Georgia State University, 2008. http://digitalarchive.gsu.edu/math_theses/62/.

Abstract:

Thesis (M.S.)--Georgia State University, 2008.
Title from title page (Digital Archive@GSU, viewed July 29, 2010) Yu-Sheng Hsu, committee chair; Gary Hastings, Jun Han, committee members. Includes bibliographical references (p. 41).

32

Alves, Evandro Roberto. "Sistemas de análises químicas em fluxo explorando multi-impulsão, interface única ou quimiometria." Universidade de São Paulo, 2009. http://www.teses.usp.br/teses/disponiveis/64/64135/tde-14052010-092516/.

Abstract:

Os sistemas de análises em fluxo com multi-impulsão (MPFS) têm como característica principal o emprego de bombas solenóide como unidade propulsora de fluidos, as quais proporcionam fluxo pulsado. Este regime de fluxo foi avaliado em função das condições de mistura entre as soluções envolvidas, transferência de calor e difusão gasosa. A associação dos métodos quimiométricos de análises e dos sistemas MPFS foi demonstrada em relação à determinação espectrofotométrica de glicose, frutose e glicerol em vinhos fermentiscíveis e caldos de cana-de-açúcar. O método se fundamentou na reação dos carboidratos com metaperiodato de sódio e posterior oxidação de iodeto pelo metaperiodato remanescente com monitoramento de [I3 -] produzido. O tratamento dos dados envolveu calibração multivariada, empregando o algoritmo PLS e os resultados são concordantes com aqueles obtidos por cromatografia líquida de troca aniônica com detecção por amperometria pulsada. O sistema proposto é simples e robusto, capaz de analisar 120 amostras por hora. O fluxo pulsado proporcionou melhoria no desenvolvimento reacional no que diz respeito à transferência de calor e difusão gasosa. Esse aspecto se deve principalmente ao aumento do transporte de massas no sentido radial. Estes fatos foram constatados na determinação espectrofotométrica de açúcares redutores totais (ART) e etanol. O sistema MPFS proposto para a determinação de ART envolveu hidrólise ácida da sacarose e degradação alcalina dos carboidratos. A natureza do fluxo pulsado possibilitou o uso de menores temperaturas de um banho termostatizado durante as etapas de hidrólise e degradação, bem como a diminuição da alcalinidade. Para a mecanização da determinação espectrofotométrica de etanol envolvendo a redução de Cr(VI) a Cr(III) sob condições ácidas, foi desenvolvido um sistema MPFS, o qual se demonstrou eficiente e adequado para procedimentos que envolvem difusão gasosa. 
Após otimização dos principais parâmetros envolvidos, os mesmos foram comparados empregando o sistema de multi-comutação, cujo fluxo é laminar. Melhores resultados analíticos foram obtidos no sistema proposto, que resultou em boa sensibilidade. Em relação aos sistemas de análises em fluxo que exploram interface reacional única (SIFA), foram demonstradas suas potencialidades através da implementação de procedimentos que envolvem determinações simultâneas, sem a necessidade de reconfigurações no módulo de análises. Ainda, a simplificação da etapa de otimização foi espectrofotometricamente avaliada através da determinação de alumínio, ferro total e P-PO4. O sistema proposto é de configuração simples e capaz de analisar 130, 140 e 90 amostras de alumínio, ferro total e fósforo por hora, respectivamente
Multi-pumping flow systems (MPFS) have as their main feature the use of solenoid pumps as fluid-propelling devices, which deliver pulsed flow. This flow regime was evaluated with respect to the mixing conditions between the solutions involved, heat transfer and gas diffusion. The association of chemometric methods of analysis with MPFS was demonstrated in the spectrophotometric determination of glucose, fructose and glycerol in fermentable musts and sugar cane juices. The method was based on the reaction of the carbohydrates with sodium metaperiodate, followed by oxidation of iodide by the remaining metaperiodate and monitoring of the [I3-] produced. Data treatment involved multivariate calibration relying on the PLS algorithm, and results were in agreement with those obtained by anion-exchange liquid chromatography with pulsed amperometric detection. The proposed system is simple and rugged, allowing 120 samples to be run per hour. The pulsed flow improved reaction development with regard to heat transfer and gas diffusion, mainly because of the increased radial mass transport. These aspects were verified in the spectrophotometric determination of total reducing sugars (TRS) and ethanol. The proposed MPFS for TRS determination involved in-line acid hydrolysis of sucrose and alkaline degradation of the carbohydrates. The pulsed nature of the flow allowed lower temperatures to be used in the thermostated bath during the hydrolysis and degradation steps, as well as lower alkalinity. The MPFS for the spectrophotometric determination of ethanol, involving diffusion towards an acceptor stream, reduction of Cr(VI) to Cr(III) under acidic conditions, and Cr(III) monitoring, proved to be efficient and amenable to analytical procedures involving gas diffusion. After optimization of the main parameters, the system was compared with a multicommuted flow system (MCFA), which operates under laminar flow. Better analytical results were obtained with the proposed system, which showed good sensitivity. Regarding flow systems exploiting a single reaction interface (SIFA), their potential was demonstrated by implementing analytical procedures for simultaneous determinations without requiring reconfiguration of the flow manifold. In addition, the simplification of the optimization step was evaluated in the spectrophotometric determination of aluminum, total iron and phosphate. The proposed system has a simple configuration and allows 130, 140 and 90 samples of aluminum, total iron and phosphate to be run per hour, respectively.

33

Flaxman, Teresa. "Neuromuscular Strategies for Regulating Knee Joint Moments in Healthy and Injured Populations." Thesis, Université d'Ottawa / University of Ottawa, 2017. http://hdl.handle.net/10393/36102.

Abstract:

Background: Joint stability has been experimentally and clinically linked to mechanisms of knee injury and joint degeneration. The only dynamic, and perhaps most important, regulators of knee joint stability are contributions from muscular contractions. In participants with unstable knees, such as anterior cruciate ligament (ACL) injured, a range of neuromuscular adaptations has been observed including quadriceps weakness and increased co-activation of adjacent musculature. This co-activation is seen as a compensation strategy to increase joint stability. In fact, despite increased co-activation, instability persists and it remains unknown whether observed adaptations are the result of injury induced quadriceps weakness or the mechanical instability itself. Furthermore, there exists conflicting evidence on how and which of the neuromuscular adaptations actually improve and/or reduce knee joint stability. Purpose: The overall aim of this thesis is therefore to elucidate the role of injury and muscle weakness on muscular contributions to knee joint stability by addressing two main objectives: (1) to further our understanding of individual muscle contribution to internal knee joint moments; and (2) to investigate neuromuscular adaptations, and their effects on knee joint moments, caused by either ACL injury or experimental voluntary quadriceps inhibition (induced by pain). Methods: The relationship between individual muscle activation and internal net joint moments was quantified using partial least squares regression models. To limit the biomechanical contributions to force production, surface electromyography (EMG) and kinetic data were elicited during a weight-bearing isometric force matching task. A cross-sectional study design determined differences in individual EMG-moment relationships between ACL-deficient and healthy control (CON) groups.
A crossover placebo controlled study design determined these differences in healthy participants with and without induced quadriceps muscle pain. Injections of hypertonic saline (5.8%) to the vastus medialis induced muscle pain. Isotonic saline (0.9%) acted as control. Effect of muscle pain on muscle synergies recruited for the force matching task, lunging and squatting tasks was also evaluated. Synergies were extracted using a concatenated non-negative matrix factorization framework. Results/Discussion: In CON, significant relationships of the rectus femoris and tensor fascia latae to knee extension and hip flexion; hamstrings to hip extension and knee flexion; and gastrocnemius and hamstrings to knee rotation were identified. Vastii activation was independent of moment generation, suggesting mono-articular vastii activate to produce compressive forces, essentially bracing the knee, so that bi-articular muscles crossing the hip can generate moments for the purpose of sagittal plane movement. Hip ab/adductor muscles modulate frontal plane moments, while hamstrings and gastrocnemius support the knee against externally applied rotational moments. Compared to CON, ACL had 1) stronger relationships between rectus femoris and knee extension, semitendinosus and knee flexion, and gastrocnemius and knee flexion moments; and 2) weaker relationships between biceps femoris and knee flexion, gastrocnemius and external knee rotation, and gluteus medius and hip abduction moments. Since the knee injury mechanism is associated with shallow knee flexion angles, valgus alignment and rotation, adaptations after ACL injury are suggested to improve sagittal plane stability, but reduce frontal and rotational plane stability.
During muscle pain, EMG-moment relationships of 1) semitendinosus and knee flexor moments were stronger compared to no pain, while 2) rectus femoris and tensor fascia latae to knee extension moments and 3) semitendinosus and lateral gastrocnemius to knee internal rotation moments were reduced. Results support the theory that adaptations to quadriceps pain reduce knee extensor demand to protect the joint and prevent further pain; however, changes in non-painful muscles reduce rotational plane stability. Individual muscle synergies were identified for each moment type: flexion and extension moments were respectively accompanied by dominant hamstring and quadriceps muscle synergies while co-activation was observed in muscle synergies associated with abduction and rotational moments. Effect of muscle pain was not evident on muscle synergies recruited for the force matching task. This may be due to low loading demands and/or a subject-specific redistribution of muscle activation. Similarly, muscle pain did not affect synergy composition in lunging and squatting tasks. Rather, activation of the extensor dominant muscle synergy and knee joint dynamics were reduced, supporting the notion that adaptive response to pain is to reduce the load and risk of further pain and/or injury. Conclusion: This thesis evaluated the interrelationship between muscle activation and internal joint moments and the effect of ACL injury and muscle pain on this relationship. Findings indicate muscle activation is not always dependent on its anatomical orientation as previous works suggest, but rather on its role in maintaining knee joint stability especially in the frontal and transverse loading planes. In tasks that are dominated by sagittal plane loads, hamstring and quadriceps will differentially activate. However, when the knee is required to resist externally applied rotational and abduction loads, strategies of global co-activation were identified.
Contributions from muscles crossing the knee for supporting against knee adduction loads were not apparent. Instead, hip abductors were deemed more important regulators of knee abduction loads. Both muscle pain and ACL groups demonstrated changes in muscle activation that reduced rotational stability. Since frontal plane EMG-moment changes were not present during muscle pain, reduced relationships between hip muscles and abduction moments may be chronic adaptations by ACL that facilitate instability. Findings provide valuable insight into the roles muscles play in maintaining knee joint stability. Rehabilitative/preventative exercise interventions should focus on neuromuscular training during tasks that elicit rotational and frontal loads (i.e. side cuts, pivoting maneuvers) as well as maintaining hamstring balance, hip abductor and plantarflexor muscle strength in populations with knee pathologies and quadriceps muscle weakness.
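
The synergy extraction mentioned in the abstract relies on non-negative matrix factorization. The sketch below shows plain NMF with Lee-Seung multiplicative updates, not the concatenated framework used in the thesis. The function names, the muscles-by-time layout of `V`, and the default parameter values are all assumptions made for illustration.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for rv, rr in zip(V, WH) for v, r in zip(rv, rr))


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m).

    In a synergy reading (an assumption here), rows of V would be muscles,
    columns time samples, columns of W synergy weights, rows of H activations.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start so the multiplicative updates can move.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return W, H
```

Because every update multiplies by a ratio of non-negative quantities, `W` and `H` stay non-negative throughout, which is what allows the factors to be read as additive muscle groupings.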

34

Liggett, Rachel Esther. "Multivariate Approaches for Relating Consumer Preference to Sensory Characteristics." The Ohio State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=osu1282868174.

35

Bergfors, Linus. "Explorative Multivariate Data Analysis of the Klinthagen Limestone Quarry Data." Thesis, Uppsala University, Department of Information Technology, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-122575.

Abstract:

Current quarry planning at Klinthagen is rough, which provides an opportunity to introduce new methods to improve quarry yield and efficiency. Nordkalk AB, active at Klinthagen, wishes to start a new quarry at a nearby location. To exploit future quarries efficiently and ensure production quality, multivariate statistics may help gather important information.

In this thesis the possibilities of the multivariate statistical approaches of Principal Component Analysis (PCA) and Partial Least Squares (PLS) regression were evaluated on the Klinthagen bore data. PCA data were spatially interpolated by Kriging, which also was evaluated and compared to IDW interpolation.

Principal component analysis supplied an overview of the relations between the variables, but also visualised the problems involved in linking geophysical data to geochemical data and the inaccuracy introduced by insufficient data quality.

The PLS regression further emphasised the geochemical-geophysical problems, but also showed good precision when applied to strictly geochemical data.

Spatial interpolation by Kriging did not result in significantly better approximations than the less complex control interpolation by IDW.

In order to improve the information content of the data when modelled by PCA, a more discrete sampling method would be advisable. The data quality may cause trouble, though with today's sampling technique it was considered to be of less consequence.

Faced with a single geophysical component to be predicted from chemical variables, further geophysical data are needed to complement the existing data to achieve satisfactory PLS models.

The stratified rock composition caused trouble when spatially interpolated. Further investigations should be performed to develop more suitable interpolation techniques.
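
For readers unfamiliar with the control method named in the abstract, here is a minimal sketch of inverse distance weighting (IDW) in two dimensions. The function name `idw_interpolate`, the 2-D coordinates, and the default exponent of 2 are assumptions made for illustration; the abstract does not give the settings used in the thesis.

```python
import math


def idw_interpolate(points, values, query, power=2.0):
    """Estimate a value at `query` as a distance-weighted mean of known samples.

    points : list of (x, y) sample locations
    values : list of measurements, one per location
    query  : (x, y) location to estimate
    power  : exponent of the inverse-distance weight (2.0 is an assumed default)
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (px, py), value in zip(points, values):
        distance = math.hypot(query[0] - px, query[1] - py)
        if distance == 0.0:
            # The query coincides with a sample, so return that sample exactly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total
```

Unlike Kriging, IDW needs no fitted variogram, which is why it serves as a simple baseline. Its estimate always lies between the smallest and largest sample values.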

36

Nothnagel, Carien. "Multivariate data analysis using spectroscopic data of fluorocarbon alcohol mixtures / Nothnagel, C." Thesis, North-West University, 2012. http://hdl.handle.net/10394/7064.

Abstract:

Pelchem, a commercial subsidiary of Necsa (South African Nuclear Energy Corporation), produces a range of commercial fluorocarbon products while driving research and development initiatives to support the fluorine product portfolio. One such initiative is to develop improved analytical techniques to analyse product composition during development and to quality-assure products. Generally the C–F type products produced by Necsa are in a solution of anhydrous HF, and cannot be directly analyzed with traditional techniques without derivatisation. A technique such as vibrational spectroscopy, which can analyze these products directly without further preparation, will have a distinct advantage. However, spectra of mixtures of similar compounds are complex and not suitable for traditional quantitative regression analysis. Multivariate data analysis (MVA) can be used in such instances to exploit the complex nature of spectra to extract quantitative information on the composition of mixtures. A selection of fluorocarbon alcohols was made to act as representatives for fluorocarbon compounds. Experimental design theory was used to create a calibration range of mixtures of these compounds. Raman and infrared (NIR and ATR–IR) spectroscopy were used to generate spectral data of the mixtures, and these data were analyzed with MVA techniques by the construction of regression and prediction models. Selected samples from the mixture range were chosen to test the predictive ability of the models. Analysis and regression models (PCR, PLS2 and PLS1) gave good model fits (R² values larger than 0.9). Raman spectroscopy was the most efficient technique and gave a high prediction accuracy (at 10% accepted standard deviation), provided the minimum mass of a component exceeded 16% of the total sample. The infrared techniques also performed well in terms of fit and prediction. The NIR spectra were subject to signal saturation as a result of using long path length sample cells. This was shown to be the main reason for the loss in efficiency of this technique compared to Raman and ATR–IR spectroscopy. It was shown that multivariate data analysis of spectroscopic data of the selected fluorocarbon compounds could be used to quantitatively analyse mixtures, with the possibility of further optimization of the method. The study was a representative study indicating that the combination of MVA and spectroscopy can be used successfully in the quantitative analysis of other fluorocarbon compound mixtures.
Thesis (M.Sc. (Chemistry))--North-West University, Potchefstroom Campus, 2012.

37

RENTERIA, RAUL PIERRE. "ALGORITHMS FOR PARTIAL LEAST SQUARES REGRESSION." PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2003. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=4362@1.

Abstract:

CONSELHO NACIONAL DE DESENVOLVIMENTO CIENTÍFICO E TECNOLÓGICO
Muitos problemas da área de aprendizagem automática têm por objetivo modelar a complexa relação existente num sistema, entre variáveis de entrada X e de saída Y na ausência de um modelo teórico. A regressão por mínimos quadrados parciais PLS (Partial Least Squares) constitui um método linear para resolução deste tipo de problema, voltado para o caso de um grande número de variáveis de entrada quando comparado com número de amostras. Nesta tese, apresentamos uma variante do algoritmo clássico PLS para o tratamento de grandes conjuntos de dados, mantendo um bom poder preditivo. Dentre os principais resultados destacamos uma versão paralela PPLS (Parallel PLS) exata para o caso de apenas uma variável de saída e uma versão rápida e aproximada DPLS (Direct PLS) para o caso de mais de uma variável de saída. Por outro lado, apresentamos também variantes para o aumento da qualidade de predição graças à formulação não linear. São elas o LPLS (Lifted PLS), algoritmo para o caso de apenas uma variável de saída, baseado na teoria de funções de núcleo (kernel functions), uma formulação kernel para o DPLS e um algoritmo multi-kernel MKPLS capaz de uma modelagem mais compacta e maior poder preditivo, graças ao uso de vários núcleos na geração do modelo.
The purpose of many problems in the machine learning field is to model the complex relationship in a system between the input X and output Y variables when no theoretical model is available. Partial Least Squares (PLS) is a linear method for this kind of problem, suited to the case of many input variables relative to the number of samples. In this thesis we present versions of the classical PLS algorithm designed for large data sets while keeping good predictive power. Among the main results we highlight PPLS (Parallel PLS), an exact parallel version for the case of only one output variable, and DPLS (Direct PLS), a fast and approximate version for the case of more than one output variable. We also present variants of the regression algorithm that can enhance the predictive quality through a non-linear formulation. We introduce LPLS (Lifted PLS), for the case of only one dependent variable, based on the theory of kernel functions; KDPLS, a non-linear formulation of DPLS; and MKPLS, a multi-kernel algorithm that can yield a more compact model and better prediction quality, thanks to the use of several kernels in building the model.
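
To make the single-output case concrete, the sketch below implements a textbook NIPALS-style PLS1 regression in plain Python. It is the classical baseline only, not the PPLS, DPLS, LPLS or MKPLS variants proposed in the thesis; the function names `pls1_fit` and `pls1_predict` and the dictionary layout of the model are assumptions made for illustration.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit PLS1 (one response) by successive deflation of centered X and y."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # X residual
    f = [v - y_mean for v in y]                                # y residual
    weights, loadings, coefs = [], [], []
    for _ in range(n_components):
        # Weight vector: direction in X-space of maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left in y that X can explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate: remove what this latent variable explains.
        E = [[E[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        weights.append(w)
        loadings.append(p_load)
        coefs.append(q)
    return {"x_mean": x_mean, "y_mean": y_mean,
            "weights": weights, "loadings": loadings, "coefs": coefs}


def pls1_predict(model, x):
    """Predict the response for one sample by replaying the deflation steps."""
    e = [x[j] - model["x_mean"][j] for j in range(len(x))]
    y_hat = model["y_mean"]
    for w, p_load, q in zip(model["weights"], model["loadings"], model["coefs"]):
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * p_load[j] for j in range(len(e))]
    return y_hat
```

With as many components as the rank of X, this procedure reproduces the ordinary least squares fit; using fewer components is what gives PLS its stability when inputs are many and collinear.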


38

Le, Floch Edith. "Méthodes multivariées pour l'analyse jointe de données de neuroimagerie et de génétique." Phd thesis, Université Paris Sud - Paris XI, 2012. http://tel.archives-ouvertes.fr/tel-00753829.

Full text

Abstract:

Brain imaging is attracting growing interest as an intermediate phenotype for understanding the complex path that links genes to a behavioral or clinical phenotype. In this context, a first goal is to propose methods able to identify the part of genetic variability that explains some part of the variability observed in neuroimaging. Classical univariate approaches ignore the joint effects that may exist between several genes, as well as potential covariations between brain regions. Our first contribution was to seek to improve the sensitivity of the univariate approach by taking advantage of the multivariate nature of genetic data at the local level. Specifically, we adapt cluster-level inference from neuroimaging to single nucleotide polymorphism (SNP) data, searching for 1D clusters of adjacent SNPs associated with the same imaging phenotype. We then extend this idea and combine voxel clusters with SNP clusters, using a simple test at the "4D cluster" level that jointly detects strongly associated brain and genomic regions. We obtain promising preliminary results on both simulated and real data. Our second contribution was to use exploratory multivariate methods to increase the detection power of imaging genetics studies by modeling the potential multivariate nature of the associations at a longer range, from both the imaging and the genetics standpoint. Partial Least Squares regression and canonical analysis have recently been proposed for the analysis of genetic and transcriptomic data. Here we propose to transpose this idea to the analysis of genetic and imaging data. In addition, we study different regularization and dimension reduction strategies, combined with PLS or canonical analysis, to cope with the overfitting caused by the very high dimensionality of the data. We present a comparative study of these strategies on simulated data and on real functional MRI and SNP data. Univariate filtering appears to be necessary. However, it is the combination of univariate filtering and L1-regularized PLS that detects a generalizable and significant association on the real data, which suggests that discovering associations in imaging genetics requires a multivariate approach.


39

Ozer, Semih. "Analysis Of Critical Factors Affecting Customer Satisfaction In Modular Kitchen Sector." Master's thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/2/12610593/index.pdf.

Full text

Abstract:

This study begins with a review of the literature on customer satisfaction and on customer satisfaction methods and models. After selecting an appropriate method and model, the study conducts a survey and a questionnaire among customers and professionals in the modular kitchen sector. The aim is to analyze the factors affecting customer satisfaction and to identify those specific to the modular kitchen sector. After the survey, the relations between the inputs and outputs of satisfaction are analyzed together with overall satisfaction itself. The strong and weak factors are determined, and a CRM tool is built to serve as a decision-support and forecasting tool; it can be seen as a starting point for real-sector companies in this business to build and use much more detailed, ERP-integrated software. The survey results are compared with those of similar studies in the literature.


40

Larsson, Daniel. "Multivariat dataanalys för att undersöka skillnader i undervisnings- och bedömningspraxis i kursen kemi 2." Thesis, Linnéuniversitetet, Institutionen för didaktik och lärares praktik (DLP), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-70394.

Full text

Abstract:

Although formative assessment is advocated within the research community, there is currently very large variation in how formative assessment is implemented in schools and in its effects. Methods for mapping formative assessment practice are needed to distinguish "good" from "less good" practice. The aim of this thesis was to use a student questionnaire and multivariate projection methods such as PCA and PLS-DA to map and distinguish the formative assessment practice of six upper secondary classes that had completed the course Chemistry 2 (kemi 2). A secondary aim was to use the same tools to characterize and distinguish the frequencies of different teaching activities carried out within the same course and classes. The study showed, in a graphical and illustrative way, large variation in experiences of formative assessment within the surveyed classes. PCA also proved to be an excellent tool for identifying student responses that fell outside the "normal" variation. A PLS-DA analysis revealed a difference in the frequencies of teaching activities between two municipal schools and one private school, although these results should be interpreted with some caution.


41

Rogers, C. A. "Partial least squares (PLS) : a comparative assessment." Thesis, University of Bath, 1989. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.235583.

Full text


42

Mohiddin, Syed B. "Development of novel unsupervised and supervised informatics methods for drug discovery applications." The Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc_num=osu1138385657.

Full text


43

Fortes, Paula Regina. "Calibração multivariada e cinética diferencial em sistemas de análises em fluxo com detecção espectrofotométrica." Universidade de São Paulo, 2006. http://www.teses.usp.br/teses/disponiveis/64/64132/tde-14082006-120124/.

Full text

Abstract:

Differential kinetic analysis can be implemented in a flow analysis system, as demonstrated here by an improved spectrophotometric catalytic determination of iron and vanadium in Fe-V alloys. The method relies on the influence of Fe2+ and VO2+ on the rate of iodide oxidation by dichromate under acidic conditions; a Jones reductor was therefore needed. A flow injection analysis (FIA) system and a multi-pumping flow system (MPFS) were designed and evaluated. In both systems, the alloy solution was inserted into an acidic KI solution that also acted as the carrier stream, and a dichromate solution was added by confluence. Successive measurements were performed while the processed sample zone passed through the detector, each one related to a different yet reproducible condition for reaction development. Data treatment involved multivariate calibration by the PLS algorithm. The FIA system proved poorly suited to multi-parametric determinations, as the fluid elements produced by the laminar flow regime did not carry enough kinetic information for the modeling steps. The MPFS, on the other hand, showed that pulsed flow improves the figures of merit owing to the chaotic movement of the fluid elements. The proposed MPFS is simple and rugged, running 50 samples per hour with a consumption of 48 mg KI per determination. The first two latent variables carry ca. 94% of the analytical information, indicating that the intrinsic dimensionality of the data set is two. Results agree with those obtained by inductively coupled argon plasma optical emission spectrometry.


44

Sasaki, Milton Katsumi. "Projeto e desenvolvimento de um sistema de análises químicas por injeção em fluxo para determinações espectrofotométricas simultâneas de cobre e de níquel explorando cinética diferencial e calibração multivariada." Universidade de São Paulo, 2011. http://www.teses.usp.br/teses/disponiveis/64/64135/tde-08022012-094037/.

Full text

Abstract:

Differential kinetic analysis exploits differences in reaction rates between the analytes and a common reagent system, so that prior analyte separation steps can be dispensed with. Flow injection analysis (FIA) systems are an important tool for methods based on this strategy because they allow precise control of sample/reagent dispersion and of timing. The aim of this work was to exploit these two favorable aspects for the simultaneous determination of copper and nickel through their reactions with the chromogenic reagent 5-Br-PADAP. Three sample aliquots were simultaneously inserted, by means of a proportional injector, into the reagent carrier stream (75 mg L-1 5-Br-PADAP + 0.5 mol L-1 acetic acid/acetate buffer, pH 4.7) of a single-line FIA system. During transport towards the detector, the established zones coalesced, producing a complex zone that was monitored at 562 nm. The local maximum and minimum values of the resulting concentration/time function were used for multivariate calibration with the PLS-2 (partial least squares 2) chemometric tool. The reagent concentration, buffering capacity, temperature, flow rate, lengths of the analytical path and sampling loops, and initial distance between the sample zones were evaluated for building the mathematical models. The models were built from 24 mixed standard solutions of Cu2+ and Ni2+ (0.00-1.60 mg L-1 in 0.1% v/v HNO3). Two latent variables were enough to capture > 98% of the variance in the data set, and root mean square errors of prediction (RMSEP) were estimated as 0.025 and 0.071 mg L-1 for Cu and Ni, underlining the good precision of the calibration model. The proposed system shows good figures of merit: physical stability when kept in operation for four uninterrupted hours, consumption of 314 µg 5-Br-PADAP per sample, a sampling rate of 33 samples per hour (165 data points, 66 determinations), and errors in absorbance readings typically < 5%. However, the predictions of the proposed model proved inaccurate when compared with results obtained by ICP OES. Further studies of this type of matrix, and of masking techniques for the potential interferents present, are therefore needed.
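The RMSEP figure quoted in this abstract is the root mean square error of prediction over a validation set. A minimal sketch of that calculation follows; the function name `rmsep` is an assumption for this example, and any concentrations passed to it would be the user's own reference and predicted values, not data from the thesis.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction for paired reference/predicted values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("reference and predicted must be non-empty and equal in length")
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

The result carries the same unit as the inputs (mg L-1 in the abstract), which is what makes it directly comparable with the calibration range.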


45

Feng, Zijie. "Machine learning methods for seasonal allergic rhinitis studies." Thesis, Linköpings universitet, Statistik och maskininlärning, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-173090.

Full text

Abstract:

Seasonal allergic rhinitis (SAR) is an allergen-triggered disease influenced by both environmental and genetic factors. Some researchers have studied SAR with traditional genetic methodologies. As technology has developed, a new technique, single-cell RNA sequencing (scRNA-seq), has emerged that generates high-dimensional data. We apply two machine learning (ML) algorithms, random forest (RF) and partial least squares discriminant analysis (PLS-DA), for cell source classification and gene selection, using SAR scRNA-seq time-series data from three allergic patients and four healthy controls, denoised by single-cell variational inference (scVI). We additionally propose a new fitting method, combining the bootstrap and cubic smoothing splines, to fit the averaged gene expression per cell from different populations. In summary, we find that both RF and PLS-DA provide high classification accuracy, and that RF is preferable given its stable performance and strong gene-selection ability. Our analysis identifies 10 genes with discriminatory power to separate cells of allergic patients from those of healthy controls at any time point. Although we found no literature showing direct connections between these 10 genes and SAR, some studies indirectly support potential associations. This suggests that it may be possible to alert allergic patients before a disease outbreak based on their genetic information. Our experimental results also indicate that ML algorithms may reveal relationships between genes and SAR beyond those found with traditional techniques, which will need to be examined from a genetics perspective in future work.


46

Lannsjö, Fredrik. "Forecasting the Business Cycle using Partial Least Squares." Thesis, KTH, Matematisk statistik, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-151378.

Full text

Abstract:

Partial Least Squares is both a regression method and a tool for variable selection that is especially appropriate for models based on numerous (possibly correlated) variables. While PLS is a well-established modeling tool in chemometrics, this thesis adapts it to financial data to predict the movements of the business cycle as represented by the OECD Composite Leading Indicators. High-dimensional data are used, and a model with automated variable selection through a genetic algorithm is developed to forecast different economic regions, with good results in out-of-sample tests.


47

Oyedele, Opeoluwa Funmilayo. "The construction of a partial least squares biplot." Doctoral thesis, University of Cape Town, 2014. http://hdl.handle.net/11427/12948.

Full text

Abstract:

Includes bibliographical references.
In multivariate analysis, data matrices are often very large, which can make it difficult to describe their structure and to visually inspect the relationships between their rows (samples) and columns (variables). For this reason, biplots, the joint graphical display of the rows and columns of a data matrix, can be useful tools for analysis. Since they were first introduced, biplots have been employed as a form of graphical display in a number of multivariate methods, such as Correspondence Analysis (CA), Principal Component Analysis (PCA), Canonical Variate Analysis (CVA) and Discriminant Analysis (DA). Another possible application is in Partial Least Squares (PLS). First introduced as a regression method, PLS is more flexible than multivariate regression and better suited than Principal Component Regression (PCR) to predicting a set of response variables from a large set of predictor variables. Employing the biplot in PLS gave rise to the PLS biplot, a new addition to the biplot family. In the current study, this biplot was successfully applied to sensory data to investigate the relationships between the sensory panel characteristics and the chemical quality measurements of sixteen olive oils. It was also applied to a large set of mineral sorting production data to investigate the relationships between the output variables and the process factors used to produce a final product. Furthermore, the PLS biplot was applied to binomial-distributed data concerning the diabetes testing of Indian women and to Poisson-distributed data showing the diversity of arboreal marsupials (possums) in the montane ash forest. After these applications, it is proposed that the PLS biplot is a useful graphical tool for displaying results from the (univariate) Partial Least Squares-Generalized Linear Model (PLS-GLM) analysis of a data set. With Partial Least Squares Regression (PLSR) being a valuable method for modelling high-dimensional data, especially in chemometrics, the PLS biplot was also successfully applied to cereal evaluation data containing one hundred and forty-five infrared spectra and six chemical properties, and to gene expression data with two thousand genes.


48

Manuzon, Michele Yabes. "Investigation of Pseudomonas Biofilm Development and Removal on Dairy Processing Equipment Surfaces Using Fourier Transform Infrared (FT-IR) Spectroscopy." The Ohio State University, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=osu1253576498.

Full text


49

Rech, André Machado. "Caracterização de bebidas à base de soja empregando espectroscopia no infravermelho médio com transformada de Fourier por reflexão total atenuada e quimiometria." Biblioteca Digital de Teses e Dissertações da UFRGS, 2018. http://hdl.handle.net/10183/180663.

Full text

Abstract:

In this work, strategies were studied for the characterization of soy-based beverages (SBB) by Fourier transform mid-infrared spectroscopy with an attenuated total reflectance accessory (FTIR-ATR). Twenty commercial SBB samples were used, covering 7 different flavors and 3 different brands. The contents studied were total sugars, reducing sugars, non-reducing sugars, and total proteins. Multivariate regression models were built by partial least squares (PLS), using interval partial least squares (iPLS) and synergy interval partial least squares (siPLS) as variable selection methods. Variable selection by siPLS gave the best results for the models built. Among the properties evaluated, total sugars gave models with low calibration (cross-validation) and prediction errors (RMSECV and RMSEP) and coefficients of determination (R2cv and R2prev) close to one. For total proteins, the models gave promising results: they also had low errors (RMSECV and RMSEP) and coefficients of determination (R2cv and R2prev) close to one, even though the samples were real products and did not offer an ideal variability in protein concentration. For reducing and non-reducing sugars, the regression models did not give good results. The proposed methodology therefore shows potential for routine analysis for the simultaneous determination of total sugars and proteins, meeting the requirements for nutritional information on SBB labels, in addition to the advantages of infrared spectroscopy: fast analysis, high sample throughput, small sample amounts, low cost, non-destructive measurement and environmental friendliness.
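To make the variable selection step in this abstract concrete: iPLS splits the spectrum into contiguous intervals and fits a local PLS model on each, while siPLS fits models on combinations of intervals; the candidate with the lowest cross-validation error is typically kept. The sketch below only enumerates the candidate variable subsets. The function names and the equal-width splitting rule are assumptions for illustration, not details taken from the thesis.

```python
from itertools import combinations


def split_into_intervals(n_variables, n_intervals):
    """Return (start, stop) index pairs for near-equal contiguous intervals."""
    base, extra = divmod(n_variables, n_intervals)
    bounds, start = [], 0
    for k in range(n_intervals):
        stop = start + base + (1 if k < extra else 0)  # spread the remainder
        bounds.append((start, stop))
        start = stop
    return bounds


def synergy_candidates(n_variables, n_intervals, n_combined):
    """Yield the column indices of every combination of n_combined intervals."""
    intervals = split_into_intervals(n_variables, n_intervals)
    for combo in combinations(intervals, n_combined):
        yield [j for start, stop in combo for j in range(start, stop)]
```

Each yielded index list would then be used to subset the spectral matrix before fitting and cross-validating a PLS model; setting `n_combined` to 1 gives the plain iPLS candidates.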


50

Song, Hyojong. "An Exploratory Study of Macro-Social Correlates of Online Property Crime." Scholar Commons, 2017. http://scholarcommons.usf.edu/etd/6954.

Full text

Abstract:

Despite the recent downward trend in most traditional types of crime, online property crime (OPC), meaning financially motivated crime committed online such as online fraud, scams, and phishing, continues to increase. According to the Internet Crime Complaint Center, the number of reported complaints about OPC increased approximately sixteenfold, from 16,838 cases in 2000 to 288,012 cases in 2015, and referred financial losses increased about sixty times, from $17.8 million in 2001 to $1 billion in 2015. The increase in OPC might be directly related to greater online accessibility resulting from the accelerated progress of information and communication technology (ICT). As ICT continues to progress and advanced ICT infrastructure affects our routine activities more significantly, issues regarding OPC may become more varied and prevalent. The present study aims to explore the macro-social criminogenic structure of OPC perpetration. Specifically, it focuses on exploring probable macro-social predictors of OPC rates and examining how effectively these predictors account for variance in OPC perpetration rates. In addition, the study explores possible predictors of macro-level online opportunity structure, which is expected to have a direct relationship with OPC rates, and examines how much variance in online opportunity structure these predictors explain. To these ends, the study analyzed state-level data for the fifty U.S. states using a partial least squares regression (PLSR) approach. The results indicated that predictors related to macro-social economic conditions, such as economic inequality, poverty, economic social support, and unemployment, had a significant association with OPC. As expected, indicators in the domain of economic inequality predicted greater OPC rates, and those in the domain of economic social support were related to lower OPC rates. However, poverty and unemployment predictors were negatively associated with OPC, the opposite direction from the relationships between these predictors and traditional street crime. In addition, indicators of online opportunity structure had a significantly positive relationship with OPC, as expected. The PLSR model for predicting OPC accounted for approximately 50% of the variance in OPC rates across states. For predictors of online opportunity structure, the results indicated that online opportunity was associated with state-level economic and socio-demographic characteristics: states with less poverty, a larger urban population, and more working-age adults were more likely to report more online opportunities. The PLSR model for predicting online opportunity structure explained about 80% of the variance in measured online opportunity. These results may imply that some macro-social conditions have an indirect effect on OPC through online opportunity structure in addition to their direct effects on OPC. Future studies should pay more attention to the structural relationships among macro-social contexts, online opportunity structure, and OPC in order to understand the macro-level criminogenic mechanism of OPC.

