Theses on the topic "Heteroscedastic Multivariate Linear Regression"
Consult the top 50 theses for your research on the topic "Heteroscedastic Multivariate Linear Regression".
Kuljus, Kristi. "Rank Estimation in Elliptical Models : Estimation of Structured Rank Covariance Matrices and Asymptotics for Heteroscedastic Linear Regression". Doctoral thesis, Uppsala universitet, Matematisk statistik, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-9305.
Bai, Xiuqin. "Robust mixtures of regression models". Diss., Kansas State University, 2014. http://hdl.handle.net/2097/18683.
Department of Statistics
Kun Chen and Weixin Yao
This proposal contains two projects that are related to robust mixture models. In the first project, we propose a new robust mixture of regression models (Bai et al., 2012). The existing methods for fitting mixture regression models assume a normal distribution for the error and then estimate the regression parameters by the maximum likelihood estimate (MLE). In this project, we demonstrate that the MLE, like the least squares estimate, is sensitive to outliers and heavy-tailed error distributions. We propose a robust estimation procedure and an EM-type algorithm to estimate the mixture regression models. Using a Monte Carlo simulation study, we demonstrate that the proposed new estimation method is robust and works much better than the MLE when there are outliers or the error distribution has heavy tails. In addition, the proposed robust method works comparably to the MLE when there are no outliers and the error is normal. In the second project, we propose a new robust mixture of linear mixed-effects models. The traditional mixture model with multiple linear mixed effects, assuming a Gaussian distribution for the random and error parts, is sensitive to outliers. We will propose a mixture of multiple linear mixed t-distributions to robustify the estimation procedure. An EM algorithm is provided to find the MLE under the assumption of t-distributions for the error terms and random mixed effects. Furthermore, we propose to adaptively choose the degrees of freedom for the t-distribution using profile likelihood. In the simulation study, we demonstrate that our proposed model works comparably to the traditional estimation method when there are no outliers and the errors and random mixed effects are normally distributed, but works much better if there are outliers or the distributions of the errors and random mixed effects have heavy tails.
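The contrast drawn in this abstract, between a normal-error MLE that is sensitive to outliers and a t-based alternative, can be illustrated with a minimal sketch. It is not the thesis's procedure: it fits a single regression line rather than a mixture, and it fixes the degrees of freedom `nu` instead of choosing them by profile likelihood. The function names and the synthetic data are assumptions made only for illustration.

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares (the normal-error MLE of the coefficients)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def t_regression_em(X, y, nu=3.0, n_iter=100):
    """EM for one linear regression with Student-t errors.

    Assumption: nu is fixed by the caller; only beta and sigma2 are estimated.
    """
    n = len(y)
    beta = ols_fit(X, y)
    sigma2 = np.mean((y - X @ beta) ** 2)
    for _ in range(n_iter):
        r = y - X @ beta
        # E-step: large scaled residuals receive small weights.
        w = (nu + 1.0) / (nu + r**2 / sigma2)
        # M-step: weighted least squares, then the scale update.
        Xw = X * w[:, None]
        beta = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        sigma2 = np.sum(w * (y - X @ beta) ** 2) / n
    return beta, sigma2

# Synthetic data (hypothetical): true line y = 1 + 2x, five gross outliers.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 100)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=x.size)
y[-5:] += 30.0  # contaminate the five largest-x observations

beta_ols = ols_fit(X, y)
beta_t, sigma2_t = t_regression_em(X, y)
```

Comparing `beta_ols[1]` with `beta_t[1]` shows how far the contaminated points pull each slope estimate away from the true value of 2.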
Solomon, Mary Joanna. "Multivariate Analysis of Korean Pop Music Audio Features". Bowling Green State University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1617105874719868.
Zuber, Verena. "A Multivariate Framework for Variable Selection and Identification of Biomarkers in High-Dimensional Omics Data". Doctoral thesis, Universitätsbibliothek Leipzig, 2012. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-101223.
Mahmoud, Mahmoud A. "The Monitoring of Linear Profiles and the Inertial Properties of Control Charts". Diss., Virginia Tech, 2004. http://hdl.handle.net/10919/29544.
Ph. D.
Ramaboa, Kutlwano. "Contributions to Linear Regression Diagnostics using the Singular Value Decomposition: Measures to Identify Outlying Observations, Influential Observations and Collinearity in Multivariate Data". Doctoral thesis, University of Cape Town, 2010. http://hdl.handle.net/11427/4391.
Souza, Aline Campos Reis de. "Modelos de regressão linear heteroscedásticos com erros t-Student : uma abordagem bayesiana objetiva". Universidade Federal de São Carlos, 2016. https://repositorio.ufscar.br/handle/ufscar/7540.
Previous issue date: 2016-02-18
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
In this work, we present an extension of the objective Bayesian analysis of Fonseca et al. (2008), based on Jeffreys priors for linear regression models with Student-t errors, in which we consider the heteroscedasticity assumption. We show that the posterior distribution generated by the proposed Jeffreys prior is proper. Through a simulation study, we analyze the frequentist properties of the resulting Bayesian estimators. We then test the robustness of the model to disturbances in the response variable, comparing its performance with that obtained under other prior distributions proposed in the literature. Finally, a real data set is used to analyze the performance of the proposed model. We detect possibly influential points through the Kullback-Leibler divergence measure and use the model selection criteria EAIC, EBIC, DIC and LPML to compare the models.
In this work, we present an extension of the objective Bayesian analysis of Fonseca et al. (2008), based on Jeffreys prior distributions for the linear regression model with Student-t errors, for which we consider the heteroscedasticity assumption. We show that the posterior distribution of the regression model parameters generated by the prior distribution is proper. Through a simulation study, we evaluate the frequentist properties of the Bayesian estimators and compare the results with other prior distributions found in the literature. In addition, a diagnostic analysis based on the Kullback-Leibler divergence measure is developed in order to study the robustness of the estimates in the presence of atypical observations. Finally, a real data set is used to fit the proposed model.
Júnior, Antônio Carlos Pacagnella. "A inovação tecnológica nas indústrias do Estado de São Paulo: uma análise dos indicadores da PAEP". Universidade de São Paulo, 2006. http://www.teses.usp.br/teses/disponiveis/96/96132/tde-25072006-151430/.
Technological innovation plays a fundamental part in the development of companies, regions and even countries. In the state of São Paulo specifically, the study of aspects relevant to this theme is of the utmost importance because it is the most industrialized and economically important state in the country. Within this context, this study aims to analyze some aspects linked to technological innovation in different sectors of industrial activity, using technological innovation indicators and business results obtained by the Paulista Research of Economic Activities (PAEP), which was carried out by the SEADE foundation over the period from 1999 to 2001.
Delmonde, Marcelo Vinicius Felizatti. "Eletro-oxidação oscilatória de moléculas orgânicas pequenas: produção de espécies voláteis e desempenho catalítico". Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/75/75134/tde-19042016-153123/.
The frequent emergence of current/potential oscillations during the electro-oxidation of small organic molecules has implications for mechanistic aspects such as the overall reaction conversion, and thus for the performance of practical energy-conversion devices. Accordingly, this work is divided into two parts. (a) The production of volatile species during the electro-oxidation of formic acid, methanol and ethanol was studied by means of on-line Differential Electrochemical Mass Spectrometry (DEMS). Besides presenting previously unreported DEMS results on the oscillatory dynamics of these systems, the work introduces the use of multivariate linear regression to compare the estimated total faradaic current with the current associated with the production of detectable volatile species, namely carbon dioxide for formic acid, carbon dioxide and methylformate for methanol, and carbon dioxide and acetaldehyde for ethanol. This analysis provided the best combination of the DEMS ion currents to represent the total faradaic current, or the maximum possible faradaic contribution of the volatile products to the global current. The results were discussed in connection with mechanistic aspects of each system. The mismatch between the estimated total current and the one obtained from the best combination of partial currents of volatile products was found to be small for formic acid, and 4 and 5 times bigger for ethanol and methanol, respectively, evidencing the increasing role played by partially oxidized soluble species in each case. (b) General features of the electro-oxidation of formaldehyde, formic acid and methanol on platinum in acid media were investigated, with emphasis on comparing performance under stationary and oscillatory regimes. The comparison is carried out by different means and generalized by the use of identical experimental conditions in all cases.
In all three systems studied, the occurrence of potential oscillations is associated with excursions of the electrode potential to lower values, which considerably decreases the overpotential of the anodic reaction compared with that in the absence of oscillations. In addition, the reactivation of the catalyst surface benefits the performance of all systems in terms of electrocatalytic activity. Finally, some mechanistic aspects of the studied reactions are also discussed.
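The regression step mentioned in part (a), finding the combination of DEMS ion currents that best represents the total faradaic current, might look roughly like the sketch below. The abstract does not state the estimator, so ordinary least squares is an assumption. The data are synthetic, and every name (`fit_current_combination`, `ion`, `total`) is hypothetical.

```python
import numpy as np

def fit_current_combination(ion_currents, total_current):
    """Least-squares weights mapping ion currents to the total current.

    Returns the weights and the root-mean-square mismatch between the
    measured total current and the fitted combination.
    """
    coef, *_ = np.linalg.lstsq(ion_currents, total_current, rcond=None)
    fitted = ion_currents @ coef
    mismatch = float(np.sqrt(np.mean((total_current - fitted) ** 2)))
    return coef, mismatch

# Synthetic example: two ion-current channels (columns) over 200 time points.
rng = np.random.default_rng(1)
ion = rng.uniform(0.0, 1.0, size=(200, 2))
true_w = np.array([3.0, 0.5])
# Assumption for the demo: a constant extra current stands in for products
# that the volatile-species channels cannot account for.
total = ion @ true_w + 0.2

coef, mismatch = fit_current_combination(ion, total)
```

In this toy setting, a larger `mismatch` means that more of the total current is left unexplained by the volatile-species channels.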
Maier, Marco J. "DirichletReg: Dirichlet Regression for Compositional Data in R". WU Vienna University of Economics and Business, 2014. http://epub.wu.ac.at/4077/1/Report125.pdf.
Series: Research Report Series / Department of Statistics and Mathematics
Rivers, Derick Lorenzo. "Dynamic Bayesian Approaches to the Statistical Calibration Problem". VCU Scholars Compass, 2014. http://scholarscompass.vcu.edu/etd/3599.
Zanon, Mattia. "Non-Invasive Continuous Glucose Monitoring: Identification of Models for Multi-Sensor Systems". Doctoral thesis, Università degli studi di Padova, 2013. http://hdl.handle.net/11577/3423010.
Diabetes is a disease that impairs the normal regulation of blood glucose levels. In people with diabetes, the body does not secrete insulin (type 1 diabetes) or alterations occur in both the secretion and the action of insulin (type 2 diabetes). Therapy is based mainly on the administration of insulin and drugs, diet and physical exercise, adjusted according to blood glucose measurements taken 3-4 times a day with finger-prick methods. Despite this, blood glucose concentration often goes beyond the normal thresholds of 70-180 mg/dL. While hyperglycemia leads to long-term complications (such as neuropathy, retinopathy, cardiovascular and heart disease), hypoglycemia can be very dangerous in the short term and, in the worst case, lead the patient into hypoglycemic coma. New scenarios in diabetes care have emerged over the last 10 years, as sensors for continuous glucose monitoring entered the clinical trial phase. These sensors can monitor blood glucose concentrations with one reading every 1-5 minutes for several days, allowing both retrospective analysis, for example to optimize metabolic control, and real-time use, to generate alerts when an exit from the normal euglycemic band is predicted, and in the so-called "artificial pancreas". Most of these continuous glucose monitoring sensors are minimally invasive because they rely on a small needle inserted under the skin. Recent years have seen growing interest in non-invasive technologies for continuous glucose monitoring, with the aim of improving patient comfort. Their ability to monitor glucose changes in the human body has been demonstrated under the highly controlled conditions typical of a clinical facility.
As soon as these conditions become less favourable (for example during daily use of these technologies), several problems arise that are associated with physiological and environmental perturbations. To address this problem, the "multisensor" concept has attracted growing interest in recent years. It consists of integrating sensors of different kinds within the same device, allowing the measurement of endogenous factors (glucose, blood perfusion, sweating, movement, etc.) and exogenous factors (temperature, humidity, etc.). The signals most correlated with glucose and those related to the other processes are combined through a suitable mathematical model with the final goal of estimating glycemia non-invasively. System models (or "white-box" models), in which differential equations describe the internal behaviour of the system, can rarely be considered. Indeed, a physical/mechanistic model linking the data measured by the multisensor to glucose is not readily available. A different approach uses data models (or "black-box" models), which describe the system under study in terms of inputs (channels measured by the non-invasive device), output (estimated glucose values) and a transfer function (which in this thesis is limited to the class of multivariate linear regression models). When identifying the model parameters, numerical problems may arise from collinearity among subsets of the channels measured by the multisensor (in particular for spectroscopy-based devices) and from the potentially high dimension of the measurement space. The aim of this doctoral thesis is to investigate and evaluate different techniques for identifying the multivariate linear regression model in order to estimate glycemia levels non-invasively.
In particular, the following methods are considered: Ordinary Least Squares (OLS); Partial Least Squares (PLS); the Least Absolute Shrinkage and Selection Operator (LASSO), based on l1-norm regularization; Ridge, based on l2-norm regularization; and Elastic-Net (EN), based on the combination of the two previous norms. As a case study for applying the proposed methodologies, we consider data measured by the multisensor device, based mainly on dielectric and optical sensors, developed by Solianis Monitoring AG (Zurich, Switzerland), which partially funded the doctoral project during which this thesis was developed. The multisensor technology and Solianis's intellectual property are now held by Biovotion AG (Zurich, Switzerland). Solianis Monitoring AG provided forty-five experimental sessions collected from 6 patients undergoing hypo- and hyperglycemic protocols at the University Hospital Zurich. The models identified with the above techniques are tested on a data set different from the one used to identify them. The results show that the complexity-control methods are more accurate than OLS. In general, the regularization-based techniques perform better than PLS. In particular, those exploiting the l1 norm (LASSO and EN) set many model coefficients to zero, making the estimated glucose profiles more robust to occasional noise affecting some channels of the multisensor. The EN model turns out to be the best, sharing both the sparsity property and the grouping effect induced by the l1 and l2 norms, respectively.
Overall, the results indicate that, although performance in terms of the accuracy of the estimated glucose profiles is not yet comparable with that of needle-based sensors, the multisensor platform combined with the EN model is a valid tool for real-time monitoring of glycemic trends. One possible application uses the glycemic trend information to complement sparse measurements taken with finger-prick methods. By exploiting the recently developed concept of dynamic risk, potentially dangerous events such as hypoglycemia can be correctly assessed. The thesis is organized in three main parts: Part I (Chapters 1-4) first provides an introduction to diabetes, a review of current technologies for non-invasive glucose monitoring (including the Solianis multisensor device) and the aims of the thesis; Part II (Chapters 5-9) presents some of the difficulties encountered when working with regression problems on high-dimensional data, and then presents OLS, PLS, LASSO, Ridge and EN, using a tutorial example to highlight their advantages and drawbacks. Finally, Part III (Chapters 10-12) presents the case-study data set and the results. Some concluding remarks and possible future developments close the thesis. In particular, a methodology based on Monte Carlo simulations for assessing the robustness of the model calibration and the use of a new objective function for model identification are briefly illustrated.
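To make the comparison of Ridge, LASSO and Elastic-Net in this abstract concrete, here is a minimal coordinate-descent sketch. It is not the thesis's code, and the data are synthetic. It assumes the common parametrization (1/(2n))·||y − Xb||² + lam·(alpha·||b||₁ + (1 − alpha)/2·||b||²), so that `alpha=1` gives LASSO and `alpha=0` gives Ridge. All names are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z towards zero by t; values within [-t, t] become exactly 0."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def elastic_net(X, y, lam, alpha, n_iter=500):
    """Cyclic coordinate descent for the elastic-net objective above."""
    n, p = X.shape
    b = np.zeros(p)
    z = (X ** 2).sum(axis=0) / n  # per-column curvature
    for _ in range(n_iter):
        for j in range(p):
            # Residual with feature j's current contribution added back.
            partial_resid = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ partial_resid / n
            b[j] = soft_threshold(rho, lam * alpha) / (z[j] + lam * (1.0 - alpha))
    return b

# Synthetic, strongly collinear channels (illustrative only).
rng = np.random.default_rng(0)
n = 60
s = rng.normal(size=n)
X = np.column_stack([
    s + 0.01 * rng.normal(size=n),  # channel 0
    s + 0.01 * rng.normal(size=n),  # channel 1, nearly identical to channel 0
    rng.normal(size=n),             # channel 2, unrelated to the response
])
y = X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=n)

b_ridge = elastic_net(X, y, lam=0.1, alpha=0.0)
b_lasso = elastic_net(X, y, lam=0.1, alpha=1.0)
b_en = elastic_net(X, y, lam=0.1, alpha=0.5)
```

The three coefficient vectors show how each penalty treats the two nearly identical channels. The l1 term can set coefficients exactly to zero, while the l2 term tends to spread weight across correlated columns.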
COSTA, Ismael Gaião da. "Desempenho agroindustrial, adaptabilidade, estabilidade e divergência genética entre clones RB de cana-de-açúcar em Pernambuco". Universidade Federal Rural de Pernambuco, 2011. http://www.tede2.ufrpe.br:8080/tede2/handle/tede2/6411.
Previous issue date: 2011-02-24
Brazil is the world's largest producer of sugarcane (Saccharum spp.), a crop that interacts with the most varied environments. The replacement of varieties has contributed greatly to an effective increase in productivity. Thus, studies of genotype x environment (G x E) interaction, analyses of phenotypic adaptability and stability, and the selection of parents for hybridization are essential for recommending varieties suited to different soil and climatic conditions. The objective of this research was to evaluate the agro-industrial performance, adaptability and phenotypic stability of 11 RB sugarcane clones in the final phase of experimentation, in sugarcane micro-regions of the State of Pernambuco, in Northeast Brazil, over three consecutive harvests, as well as to assist the selection of potential parents to be used in future crosses by the Sugarcane Breeding Program (PMGCA) of the Network for the Development of Alcohol and Sugar (RIDESA), conducted by the Carpina Sugarcane Experimental Station (EECAC) of the Federal Rural University of Pernambuco (UFRPE). The experiments were carried out in five Pernambuco sugar mills, in July and August 2006, using a randomized block design with four replications and plots of five eight-meter furrows spaced 1.0 m apart. The results were subjected to analysis of variance, comparison of means by the Scott & Knott test, and studies of adaptability, stability and genetic divergence. At each harvest, the following variables were measured: tons of pol per hectare (TPH), tons of cane per hectare (TCH), fibre (FIB), corrected Pol% (PCC), purity (PZA), soluble solids (BRIX) and total recoverable sugar (TRS). Based on the results, the best RB sugarcane genotypes were G1, G6 and G9 in environment I, G1 and G11 in environment II, G1 and G9 in environment III, G3 in environment IV and G1 in environment V.
Among the best clones, those with wide adaptability are G1 and G11, and those with adaptability to favorable environments are G6 and G9. The genotypes most indicated for use in hybridizations are G1 and G6, as they showed the greatest genetic dissimilarity.
Brazil is the world's largest producer of sugarcane (Saccharum spp.), a crop that interacts with the most varied environments. The replacement of varieties has contributed considerably to an efficient increase in productivity. In this sense, studies of genotype x environment (G x E) interaction, analyses of phenotypic adaptability and stability, and the selection of parents for crosses are indispensable for recommending varieties suited to the various soil and climatic conditions. The aim of this research was to evaluate the agro-industrial performance, adaptability and phenotypic stability of 11 RB sugarcane clones, in the final phase of experimentation, in sugarcane micro-regions of the State of Pernambuco, over three consecutive harvests, as well as to assist the selection of potential parents to be used in future crosses by the Programa de Melhoramento Genético da Cana-de-açúcar (PMGCA) of the Rede Interuniversitária para o Desenvolvimento do Setor Sucroalcooleiro (RIDESA), conducted by the Estação Experimental de Cana-de-açúcar (EECAC) of the Universidade Federal Rural de Pernambuco (UFRPE). The experiments were set up in five sugar mills in Pernambuco, in July and August 2006, using a randomized block design with four replications and plots of five eight-meter furrows spaced 1.0 m apart. The results were subjected to analysis of variance, comparison of means by the Scott & Knott test, and studies of adaptability, stability and genetic divergence. At each harvest, the following variables were measured: tons of pol per hectare (TPH), tons of cane per hectare (TCH), corrected Pol% (PCC), fibre (FIB), purity (PZA), soluble solids content (BRIX) and total recoverable sugar (ATR).
Based on the results, the most productive RB sugarcane genotypes were G1, G6 and G9 for environment I, G11 and G1 for environment II, G9 and G1 for environment III, G3 for environment IV and G1 for environment V. Among the best clones, those with wide adaptability are G1 and G11, and those with adaptability to favorable environments are G6 and G9. The genotypes most indicated for use in hybridizations are G1 and G6, as they showed the greatest genetic dissimilarity.
Martins, Natália da Silva. "Método Shenon (Shelf-life prediction for Non-accelarated Studies) na predição do tempo de vida útil de alimentos". Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/11/11134/tde-05012017-182429/.
Consumers are increasingly demanding about the quality of food and expect this quality to be maintained at a high level during the period between purchase and consumption. These expectations are a consequence not only of the requirement that food remain safe, but also of the need to minimize unwanted changes in its sensory qualities. Considering food safety and consumer demands, this study proposes a multivariate statistical method, SheNon, to predict shelf life from non-accelerated studies. The development of a multivariate method for predicting the shelf life of a food that considers all attributes, whatever their nature, introduces a new concept of data analysis for estimating the degradation mechanisms that govern foods and determines the period during which these foods retain their characteristics within acceptable levels. The proposed method can include microbiological, physical, chemical and sensory attributes, which leads to a more accurate prediction of shelf life. SheNon is easy to interpret; its main advantages are the ability to combine information of different natures and the possibility of generalization to data with an experimental structure. The method was applied to minimally processed eggplants, predicting a shelf life of around 9.6 days.
Lazar, Ann A. "Determining when time response curves differ in the presence of censorship /". Connect to abstract via ProQuest. Full text is not available online, 2008.
Hosler, Deborah Susan. "Models and Graphics in the Analysis of Categorical Variables: The Case of the Youth Tobacco Survey". [Johnson City, Tenn. : East Tennessee State University], 2002. http://etd-submit.etsu.edu/etd/theses/available/etd-0716102-095453/unrestricted/HoslerD080202.pdf.
Fausti, Giovanni, Gustaf Sandelin and Adam Bratt. "Stock Splits And The Impact On Abnormal Return : A Quantitative Research on Nasdaq Stockholm". Thesis, Stockholms universitet, Företagsekonomiska institutionen, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-194741.
Sousa, Rhelcris Salvino de. "Algoritmo evolutivo com representação inteira para seleção de características". Universidade Federal de Goiás, 2017. http://repositorio.bc.ufg.br/tede/handle/tede/7395.
Previous issue date: 2017-04-20
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES
Machine learning problems usually involve a large number of features or variables. In this context, feature selection algorithms face the challenge of determining a reduced subset of the original set. The main difficulty of this task is the large number of solutions available in the search space. The genetic algorithm is one of the most widely used techniques for this type of problem because of its implicit parallelism in exploring the search space of the problem considered. However, a binary representation is usually used to encode the solutions. This work proposes an implementation, called intEA-MLR, that uses an integer representation instead of the binary one. The integer representation improves the understanding of the data, as the features to be selected are represented by integer values, which reduces the size of the chromosome used in the search process. In this context, intEA-MLR is presented as an alternative way of solving high-dimensional regression problems. As a case study, three different data sets are used, concerning the determination of properties of interest in samples of 1) wheat grain, 2) medicine tablets and 3) petroleum. These sets were used in competitions held at the International Diffuse Reflectance Conference (IDRC) (http://cnirs.clubexpress.com/content.aspx?page_id=22&club_id=409746&module_id=190211), in the years 2008, 2012 and 2014, respectively. The results showed that the proposed solution improved on the solutions obtained with the classical implementation based on binary coding, with both more accurate prediction models and a reduced number of features. intEA-MLR also outperformed the competition winners, reaching solutions 91.17% better than the competition winner for the petroleum data set. In addition, the results indicated that the computation time required by intEA-MLR becomes relatively smaller as more features are available.
Machine learning problems usually involve a large number of features or variables. In this context, feature selection algorithms face the challenge of determining a reduced subset of the original set. The main difficulty of this task is the large number of solutions available in the search space. The genetic algorithm is one of the most widely used techniques for this type of problem because of its implicit parallelism in exploring the search space of the problem at hand. However, a binary representation is usually used to encode the solutions. This work proposes an implementation, called intEA-MLR, that uses an integer representation instead of the binary one. The integer representation improves the understanding of the data, since the features to be selected are given by integer values, which reduces the size of the chromosome used in the search process. In this context, intEA-MLR is presented as an alternative way of solving high-dimensional regression problems. As a case study, three different data sets are used, concerning the determination of properties of interest in samples of 1) wheat grain, 2) medicine tablets and 3) petroleum. These sets were used in the competitions held at the International Diffuse Reflectance Conference (IDRC) (http://cnirs.clubexpress.com/content.aspx?page_id=22&club_id=409746&module_id=190211), in 2008, 2012 and 2014, respectively. The results showed that the proposed solution improved on the solutions obtained with the classical implementation based on binary coding, with both more accurate prediction models and a reduced number of features.
intEA-MLR also outperformed the competition winners, obtaining solutions up to 91.17% better than the competition winner for the petroleum data set. In addition, the results indicated that the computation time required by intEA-MLR becomes relatively smaller as a larger number of features is available.
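The difference between the binary and integer encodings described above can be shown with a small sketch. The abstract does not detail intEA-MLR's genetic operators, so none are reproduced here. The code only contrasts the two chromosome layouts and evaluates a candidate subset by the validation RMSE of an MLR fit. The function names and the synthetic data are assumptions.

```python
import numpy as np

def decode_binary(mask):
    """Binary chromosome: one gene per feature, 1 means 'selected'."""
    return np.flatnonzero(np.asarray(mask))

def decode_integer(genes):
    """Integer chromosome: each gene is the index of a selected feature."""
    return np.unique(np.asarray(genes))

def mlr_rmse(X_train, y_train, X_val, y_val, features):
    """Fit MLR (with intercept) on the chosen columns; return validation RMSE."""
    A = np.column_stack([np.ones(len(X_train)), X_train[:, features]])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    B = np.column_stack([np.ones(len(X_val)), X_val[:, features]])
    return float(np.sqrt(np.mean((y_val - B @ coef) ** 2)))

# Synthetic data: 1000 candidate features, only three of them informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 1000))
y = X[:, [5, 42, 700]] @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=80)
X_tr, y_tr, X_va, y_va = X[:60], y[:60], X[60:], y[60:]

binary = np.zeros(1000, dtype=int)   # chromosome length = number of features
binary[[5, 42, 700]] = 1
integer = [700, 5, 42]               # chromosome length = subset size

fit_bin = mlr_rmse(X_tr, y_tr, X_va, y_va, decode_binary(binary))
fit_int = mlr_rmse(X_tr, y_tr, X_va, y_va, decode_integer(integer))
```

Both chromosomes decode to the same subset and therefore the same fitness. The integer one needs 3 genes where the binary one needs 1000, which is the size reduction the abstract refers to.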
Dieste, Andrés. "Colour development in Pinus radiata D. Don. under kiln-drying conditions". Thesis, University of Canterbury. Chemical and Process Engineering, 2002. http://hdl.handle.net/10092/1134.
Soares, Sófacles Figueiredo Carreiro. "Um novo critério para seleção de variáveis usando o Algoritmo das Projeções Sucessivas". Universidade Federal da Paraíba, 2010. http://tede.biblioteca.ufpb.br:8080/handle/tede/7184.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES
This study proposes a modification of the Successive Projections Algorithm (SPA) that makes the resulting Multiple Linear Regression (MLR) models more robust to interferents. In SPA, subsets of variables are compared based on their root mean square errors for the validation set. Under the new criterion, the comparison also takes into account the statistical prediction error (SPE) obtained for the calibration set divided by the statistical prediction error obtained for the prediction set. Also taken into account is the leverage associated with each sample. Three case studies are discussed: simulated analyte determination, food colorants (UV-VIS spectrometry), and ethanol in gasoline (NIR spectrometry). The results were evaluated using the root mean square error for an independent prediction set (Root Mean Square Error of Prediction - RMSEP), graphs of the selected variables, and the statistical t and F tests. The MLR models obtained by selection with the new function were called SPA-SPE-MLR. When an interferent was present in the prediction spectra, almost all of these models performed better than both SPA-MLR and PLS. Compared with SPA-MLR, the change gave better models in all cases, with smaller RMSEPs and fewer variables. SPA-SPE-MLR was outperformed only by some PLS models. The variables selected by SPA-SPE-MLR, when observed in the spectra, lay in regions where the interference was smallest, revealing great potential. The modification presented here is therefore a useful addition to the basic formulation of SPA.
Este trabalho propõe uma modificação no Algoritmo das Projeções Sucessivas (Sucessive Projection Algorithm - SPA), com objetivo de aumentar a robustez a interferentes nos modelos de Regressão Linear Múltipla (Multiple Linear Regression - MLR) construídos. Na formulação original do SPA, subconjuntos de variáveis são comparados entre si com base na raiz do erro quadrático médio obtido em um conjunto de validação. De acordo com o critério aqui proposto, a comparação é feita também levando em conta o erro estatístico de previsão (Statistical Prediction Error SPE) obtido para o conjunto de calibração dividido pelo erro estatístico de previsão obtido para o conjunto de previsão. Tal métrica leva em conta a leverage associada a cada amostra. Três estudos de caso envolvendo a determinação de analitos simulados, corantes alimentícios por espectrometria UV-VIS e álcool em gasolinas por espectrometria NIR são discutidos. Os resultados são avaliados em termos da raiz do erro quadrático médio em um conjunto de previsão independente (Root Mean Square Error of Prediction - RMSEP), dos gráficos das variáveis selecionadas e através do testes estatísticos t e F. Os modelos MLR obtidos a partir da seleção usando a nova função custo foram chamados aqui de SPA-SPE-MLR. Estes modelos foram comparados com o SPA-MLR e PLS. Os desempenhos de previsão do SPA-SPEMLR apresentados foram melhores em quase todos os modelos construídos quando algum interferente estava presente nos espectros de previsão. Estes modelos quando comparados ao SPA-MLR, revelou que a mudança promoveu melhorias em todos os casos fornecendo RMSEPs e números de variáveis menores. O SPA-SPE-MLR só não foi melhor que alguns modelos PLS. As variáveis selecionadas pelo SPA-SPE-MLR quando observadas nos espectros se mostraram em regiões onde a ação do interferente foi à menor possível revelando o grande potencial que tal mudança provocou. 
Desta forma a modificação aqui apresentada pode ser considerada como uma ferramenta útil para a formulação básica do SPA.
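The RMSEP figure of merit named in the abstract above can be illustrated with a minimal sketch. The function name and the sample values are hypothetical and are not taken from the thesis; the formula is the usual root mean square error over an independent prediction set.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over an independent prediction set."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical reference values and model predictions (illustration only).
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
print(rmsep(reference, predicted))
```

A smaller RMSEP on the prediction set is what the abstract reports for the modified selection criterion relative to SPA-MLR.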
Olid, Pilar. "Making Models with Bayes". CSUSB ScholarWorks, 2017. https://scholarworks.lib.csusb.edu/etd/593.
D’ávila, Rodrigo Souza. "APLICAÇÃO DE REGRESSÃO LINEAR MÚLTIPLA NA ANÁLISE DA DINÂMICA DE CÁTIONS TROCÁVEIS EM UM SISTEMA SOLO-PLANTA IRRIGADO COM ÁGUA RESIDUÁRIA". UNIVERSIDADE ESTADUAL DE PONTA GROSSA, 2013. http://tede2.uepg.br/jspui/handle/prefix/123.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Competition for water between agriculture and human needs in various regions of the world has restricted growth in food production and prompted a search for alternative sources. The use of effluent from secondary treatment of sewage (ETSE) has been a common practice in several seasonal situations. The aims of this work were: (i) to create regression models that help explain the dynamics of acidity (current, exchangeable and total), exchangeable bases and the exchangeable sodium percentage (ESP) in the soil by means of multiple linear regression (MLR), considering soil, soil solution, plant, ETSE, weather and complementary variables; and (ii) to compare the models generated by the standard method with those generated by variable selection. The MLR models were built with the stepwise, forward and backward variable selection methods and compared with the standard method through the adjusted coefficient of determination (R2adj) and the variance inflation factor (VIF). The models developed with variable selection were the most suitable. No single group of variables explained all the attributes across the scenarios and soil layers studied. In general the results were consistent: as the pH increased, the concentrations of H + Al (total acidity) and Al (potential acidity) decreased and those of Ca (calcium) and Mg (magnesium) increased. Because of the low K (potassium) content of the soil, the supply of this nutrient through irrigation with ETSE had little influence on its concentration. Owing to the high sodium adsorption ratio (SAR) of the effluent, the concentrations of sodium, as well as the ESP, increased in the soil over time. The accumulation and export of Na (sodium) by the plants were not sufficient to prevent the increase in exchangeable Na and ESP in all the scenarios and layers studied.
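The two comparison indices named in the abstract above, the adjusted coefficient of determination and the variance inflation factor, follow standard textbook formulas. The sketch below assumes the R2 values have already been obtained from fitted regressions; the function names and example numbers are hypothetical and do not come from the thesis.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    residual_df = n_samples - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / residual_df


def variance_inflation_factor(r2_j):
    """VIF_j = 1 / (1 - R2_j), where R2_j comes from regressing
    predictor j on all the other predictors."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("R2_j must lie in [0, 1)")
    return 1.0 / (1.0 - r2_j)


# Hypothetical model: R2 = 0.90 with 21 samples and 4 predictors.
print(adjusted_r2(0.90, 21, 4))
# Hypothetical predictor whose auxiliary regression gives R2_j = 0.80.
print(variance_inflation_factor(0.80))
```

Adjusted R2 penalizes additional predictors, while a large VIF flags a predictor that is nearly a linear combination of the others, which is why the two are often reported together when comparing selected and full models.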
Salawu, Emmanuel Oluwatobi. "Spatiotemporal Variations in Coexisting Multiple Causes of Death and the Associated Factors". ScholarWorks, 2018. https://scholarworks.waldenu.edu/dissertations/6108.
Franksson, Rikard. "Private Equity Portfolio Management and Positive Alphas". Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-275666.
This project analyzes Nordic companies active in Information and Communication Technology (ICT) in two parts. Part I analyzes public companies in order to construct a valuation model intended to predict the enterprise value of private companies. Part II analyzes private companies to investigate whether there are opportunities to achieve excess returns compared with investments in public companies. Part I uses multiple regression analysis to identify applicable valuation models. In that process it is shown that single-factor models give the best statistical results in terms of significance and prediction error. In descending order of predictive precision, these models are based on (1) total assets, (2) revenue, (3) EBITDA, and (4) cash flow. Part II uses model (1) and finds that the Nordic market for private ICT companies offers opportunities for excess returns compared with the corresponding public market, and that it is possible to construct portfolio strategies that increase returns further. However, in light of previous research, the return opportunities found in the private market studied do not appear to compensate investors sufficiently for the additional risks associated with investing in private companies.
Paula, Lauro Cássio Martins de. "Paralelização de algoritmos APS e Firefly para seleção de variáveis em problemas de calibração multivariada". Universidade Federal de Goiás, 2014. http://repositorio.bc.ufg.br/tede/handle/tede/3418.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES
The variable selection problem consists of selecting the attributes of a given sample that contribute most to predicting the property of interest. The Successive Projections Algorithm (APS) has been widely used for variable selection in multivariate calibration problems. Among bio-inspired algorithms, the Firefly Algorithm (AF) is a recently proposed method with potential application to several real-world problems, including variable selection. The main drawback of these two algorithms is their computational burden, which grows with the number of variables available. Recent advances in Graphics Processing Units (GPUs) have given such algorithms a powerful processing platform, and using GPUs is often indispensable for reducing computation time. In this context, this work proposes a parallel GPU implementation of an AF (AF-RLM) for variable selection using multiple linear regression (RLM) models. It also presents two APS implementations, one using RLM (APS-RLM) and the other using a sequential regressions strategy (APS-RS). These implementations aim to improve the computational efficiency of the algorithms. The advantages of the parallel implementations are demonstrated in an example involving a relatively large number of variables, in which speedup gains were obtained. In addition, AF-RLM is compared with APS-RLM and APS-RS. The results obtained indicate that AF-RLM may be a relevant contribution to the variable selection problem.
Louredo, Graciliano Márcio Santos. "Estimação via EM e diagnóstico em modelos misturas assimétricas com regressão". Universidade Federal de Juiz de Fora (UFJF), 2018. https://repositorio.ufjf.br/jspui/handle/ufjf/6662.
FAPEMIG - Fundação de Amparo à Pesquisa do Estado de Minas Gerais
The objective of this work is to present some contributions to improving maximum likelihood estimation via the EM algorithm in skew mixture models with regression, and to carry out local and global influence analysis for these models. These contributions, mostly computational in nature, aim to solve common problems in statistical modeling more efficiently. Among them is the replacement of the methods used in versions of the GEM algorithm by other techniques that reduce the problem approximately to a classical EM algorithm in the main examples of skew scale mixtures of normal distributions. After the estimation process, we also discuss the main existing techniques for diagnosing influential points, with the adaptations required by the models in focus. With this approach we wish to add regression analysis under the most recent distributions in the literature to the treatment of this class of statistical models. We also hope to pave the way for the use of similar techniques in other classes of models.
Kuhnert, Petra Meta. "New methodology and comparisons for the analysis of binary data using Bayesian and tree based methods". Thesis, Queensland University of Technology, 2003.
Nahangi, Arian A. "Modeling and Solving the Outsourcing Risk Management Problem in Multi-Echelon Supply Chains". DigitalCommons@CalPoly, 2021. https://digitalcommons.calpoly.edu/theses/2321.
Tomek, Peter. "Approximation of Terrain Data Utilizing Splines". Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-236488.
Ostrowska, Alicja. "War is Peace : A Study of Relationship Between Gender Equality and Peacefulness of a State". Thesis, Högskolan för lärande och kommunikation, Högskolan i Jönköping, HLK, Globala studier, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-27663.
Kossaï, Mohamed. "Les Technologies de L’Information et des Communications (TIC), le capital humain, les changements organisationnels et la performance des PME manufacturières". Thesis, Paris 9, 2013. http://www.theses.fr/2013PA090035/document.
ICT is a key performance factor in developed countries. This PhD thesis focuses on the adoption of ICTs and their impact on the performance of manufacturing SMEs in a developing country. Following a first part covering the theoretical and conceptual framework, the rest of the thesis is organized into three empirical studies. The first study uses a Probit model to identify the determinants of ICT adoption. Human capital seems to be the most significant explanatory variable. Based on linear regression with dummy variables, Granger causality, the Kruskal-Wallis test and Welch's ANOVA, followed by the corresponding post-hoc tests, the second study highlights the existence of a strong, statistically significant relationship between the level of ICT adoption and profitability. In the third study, several Probit models (simple, ordered and multivariate) were tested on different measures of performance. First, we show that ICT has a positive impact on the productivity, profitability and competitiveness of SMEs. Second, ICT, human capital and training are determinants of overall firm performance. Third, when combined, ICT and highly skilled human resources make an important contribution to overall performance. In conclusion, our empirical results demonstrate a positive impact of ICT, human capital and organizational change on firm performance.
Assareh, Hassan. "Bayesian hierarchical models in statistical quality control methods to improve healthcare in hospitals". Thesis, Queensland University of Technology, 2012. https://eprints.qut.edu.au/53342/1/Hassan_Assareh_Thesis.pdf.
COLPO, MARCO. "The relationship between food intake and depressive symptoms". Doctoral thesis, 2018. https://hdl.handle.net/2158/1288738.
王仁聖. "Generalized inference in heteroscedastic multivariate linear models". Thesis, 2008. http://ndltd.ncl.edu.tw/handle/90735848617742765093.
國立交通大學
統計學研究所
96
The main subject of this dissertation is the application of the generalized method to regression models with heteroscedastic AR(1) covariance matrices. The concepts of generalized p-values and generalized confidence intervals, proposed by Tsui and Weerahandi (1989) and Weerahandi (1993), respectively, provide an alternative way to handle heteroscedasticity. We extend these concepts to consider the standardized expression of the generalized multivariate test variable. Lin and Lee (2003) applied the generalized method to the MANOVA model with unequal uniform covariance structures among multiple groups. We adapt their procedure, with modifications, to regression models with heteroscedastic serial dependence. The coverage probabilities and expected areas based on our proposed procedure are satisfactory. We also find that our method can be applied to uniform structures without the special design matrix assumption.
Huang, Min-Chia y 黃敏嘉. "Multivariate Function-on-Function Linear Regression". Thesis, 2017. http://ndltd.ncl.edu.tw/handle/78499496560023216042.
國立中興大學
統計學研究所
105
Functional linear regression is an important tool for analyzing longitudinal data. In longitudinal data analysis, observations are made at irregular time points (or locations) with measurement error, and observations on the same subject are correlated. Our method is suited to these situations and aims to improve function-on-function linear regression. Traditionally, the regression coefficients of function-on-function linear regression models are estimated from the first few principal components of both the predictor and response functions. However, some useful information may be treated as error and discarded if only the first few principal components are adopted. We resolve this issue by estimating the coefficients directly from the covariance functions of the predictor and response functions. The proposed estimation approach can also be used in multiple and multidimensional function-on-function linear regression models.
Yeh, Shing-Hung y 葉世弘. "Adaptive Group Lasso for Multivariate Linear Regression". Thesis, 2009. http://ndltd.ncl.edu.tw/handle/90910161360611684952.
國立成功大學
統計學系碩博士班
97
In traditional statistical methods, estimation and variable selection are usually treated separately. The Lasso (Tibshirani, 1996) is a method for linear models that performs parameter estimation and variable selection simultaneously. However, the Lasso can be inconsistent for variable selection; the adaptive Lasso (Zou, 2006) overcomes this problem and enjoys the oracle properties. In linear regression with categorical predictors (factors), the Lasso solution selects individual dummy variables rather than whole factors. The group Lasso (Yuan and Lin, 2006) overcomes this problem: it is a natural extension of the Lasso that selects variables in groups. The group Lasso, however, suffers from estimation inefficiency and selection inconsistency. Wang and Leng (2006) show that the adaptive group Lasso estimator can be as efficient as the oracle. We propose the adaptive group Lasso for multivariate linear regression. In our study, the definition of a group differs from that of earlier studies, which regard one column of the model matrix as a group. We treat one row of the parameter matrix as a group in order to find the variables that are significant for Y.
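The row-as-group idea in the abstract above can be illustrated with the block soft-thresholding step commonly used for group Lasso penalties. This is a minimal sketch under stated assumptions, not the thesis's algorithm: the adaptive weights are assumed to be the inverse Euclidean norms of the rows of an initial estimate, and all names and numbers are hypothetical.

```python
import math


def row_norm(row):
    """Euclidean norm of one row of the coefficient matrix."""
    return math.sqrt(sum(b * b for b in row))


def group_soft_threshold(row, threshold):
    """Shrink a whole row toward zero; the row is set to zero when its norm
    does not exceed the threshold, so the predictor drops out for every response."""
    norm = row_norm(row)
    if norm <= threshold:
        return [0.0] * len(row)
    scale = 1.0 - threshold / norm
    return [scale * b for b in row]


def adaptive_row_shrinkage(initial_estimate, lam):
    """Apply the row-wise threshold lam * w_j with assumed weights w_j = 1 / ||row_j||."""
    shrunk = []
    for row in initial_estimate:
        norm = row_norm(row)
        if norm == 0.0:
            shrunk.append([0.0] * len(row))
            continue
        shrunk.append(group_soft_threshold(row, lam / norm))
    return shrunk


# Hypothetical initial estimate: 2 predictors (rows) by 2 responses (columns).
initial = [[3.0, 4.0], [0.3, 0.4]]
print(adaptive_row_shrinkage(initial, lam=1.0))
```

Because the weight grows as the initial row norm shrinks, weak predictors receive a larger threshold and are removed as a whole row, while strong predictors are shrunk only slightly.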
Lin, Jin-Sying. "Linear regression analysis for multivariate failure time observations". 1991. http://catalog.hathitrust.org/api/volumes/oclc/26228833.html.
Typescript. Vita. Includes bibliographical references (leaves 88-93).
Chen, Lianfu. "Topics on Regularization of Parameters in Multivariate Linear Regression". Thesis, 2011. http://hdl.handle.net/1969.1/ETD-TAMU-2011-12-10644.
Lin, Lung-Shun y 林隆舜. "Multivariate Linear Regression Models with Censored and Missing Responses". Thesis, 2016. http://ndltd.ncl.edu.tw/handle/36387812662217809093.
逢甲大學
統計學系統計與精算碩士班
104
Over the past few decades, statistical methods for continuous longitudinal data, which are collected repeatedly on each subject over a period of time, have received considerable attention in the literature, especially in biomedical studies and clinical trials. In longitudinal research, missing data occur frequently for many reasons, such as missed visits, withdrawal from a study, and loss to follow-up. In addition, left- and/or right-censored observations, which are not exactly quantified, arise because of lower and/or upper detection limits. To analyze longitudinal data with missing values and censored responses simultaneously, this thesis proposes the multivariate linear regression model with censored and missing responses (MLRCM). The MLRCM approach includes multivariate linear regression with censored responses (MLRC), multivariate linear regression with missing responses (MLRM) and multivariate linear regression (MLR) as special cases, which are also discussed in this thesis. A computationally flexible expectation conditional maximization (ECM) algorithm is provided to carry out maximum likelihood estimation of the model parameters. The standard errors of the estimated regression coefficients are calculated by an information-based method. A series of simulation studies is conducted to examine the finite-sample properties of the proposed model. We illustrate our methodology with a real-data example.
Lee, Wen-wei y 李文偉. "Modeling Construction Unit Rate Estimation Using Multivariate Linear Regression Analysis". Thesis, 2002. http://ndltd.ncl.edu.tw/handle/36024282213244411336.
國立中央大學
土木工程研究所
90
Estimating the unit rate is essential for determining an activity’s duration and its corresponding cost. Traditionally, the activity unit rate is estimated from expert experience or simply as the statistical average of historical data; both approaches are rough and inaccurate. This study develops a multivariate linear regression model for estimating the activity unit rate. More than 400 records for concrete placement and steel rebar activities in bridge superstructure construction were collected and used for the analysis. Two multivariate linear regression models were developed as a result, one for concrete placement (R2* = 0.8558) and the other for steel rebar (R2* = 0.7196). The crew size, unit work quantity, % of foreign workers, and findings are reported.
Wu, Chung-Fu y 吳重孚. "Building the Multivariate Linear Regression Model of DPP-IV Inhibitors". Thesis, 2012. http://ndltd.ncl.edu.tw/handle/19999224791197635235.
陳易駿. "Multivariate Multiple Linear Profile Monitoring Based on Partial Least Squares Regression". Thesis, 2014. http://ndltd.ncl.edu.tw/handle/69003909692295256622.
國立清華大學
統計學研究所
102
Quality control of manufacturing processes and effective process monitoring have become important issues in recent years. In many practical manufacturing processes, quality can be expressed as a function of one or more explanatory variables and response variables; data of this kind are known as profile data. Much of the literature addresses linear profile monitoring, but current methods apply only when the number of observations is sufficient to estimate all regression parameters. When the number of observations is not sufficient to estimate all parameters, there is still no effective method for monitoring the linear profile. For multivariate multiple linear profile monitoring with an insufficient number of observations, this work proposes a control chart based on partial least squares. We use the proposed control chart to perform linear profile monitoring and assess its efficiency through statistical simulation. Finally, we use an example to illustrate how to monitor a linear profile with the proposed control chart.
Liu, Kuo-Chuan y 劉國傳. "A general results for variable selection in multivariate linear regression models". Thesis, 1994. http://ndltd.ncl.edu.tw/handle/91225231234956085304.
Tsai, Chung-Ting y 蔡忠廷. "Analysis of Variance and Hypothesis Testing for Multivariate Local Linear Regression Models". Thesis, 2017. http://ndltd.ncl.edu.tw/handle/jd639e.
國立清華大學
統計學研究所
105
In linear models, it is common to test the difference between two nested models by measuring the difference of their error sums of squares and performing an F-test. Huang and Chen (2008) [7] have extended the structure of this F-test to local polynomial regression (LPR) models (see Fan and Gijbels, 1996 [3]), constructed local and global ANOVA decompositions for LPR models, and defined an F-statistic to test whether a model function fitted by LPR is significant. This thesis extends this F-test to multivariate local linear regression (MLLR) models (see Ruppert and Wand, 1994 [17]) by mimicking a similar framework proposed by Huang and Chen (2008) [7]. We establish local and global ANOVA decompositions for MLLR models, and define two F-statistics corresponding to the following two hypotheses: (i) whether a model function fitted by MLLR is significant, and (ii) whether a model function fitted by MLLR with covariates X_2,..., X_d is more appropriate than a model function fitted by MLLR with covariates X_1,..., X_d. In the bivariate case (d = 2), the type I error and power for these two F-tests are investigated by simulations under different settings of sample sizes, correlations of covariates, values of bandwidth, and signals of rejection, while practical issues of implementing these two F-tests are also discussed, including normalization for the product kernel function. At last, these two F-tests are applied to the analysis of Boston house-price data.
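The classical nested-model F-test that the abstract above takes as its starting point can be written in a few lines. This sketch covers only the ordinary linear-model case with integer residual degrees of freedom; the local linear versions developed in the thesis define their sums of squares and degrees of freedom differently. The function name and numbers are hypothetical.

```python
def nested_f_statistic(sse_reduced, sse_full, df_reduced, df_full):
    """F = ((SSE_reduced - SSE_full) / (df_reduced - df_full)) / (SSE_full / df_full),
    where df_* are residual degrees of freedom and the reduced model is nested
    in the full model."""
    if df_full <= 0 or df_reduced <= df_full:
        raise ValueError("need df_reduced > df_full > 0")
    if sse_full <= 0:
        raise ValueError("full-model SSE must be positive")
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full
    return numerator / denominator


# Hypothetical error sums of squares for a reduced and a full model.
print(nested_f_statistic(sse_reduced=120.0, sse_full=80.0, df_reduced=48, df_full=46))
```

Under the null hypothesis that the extra terms are unnecessary, the statistic is compared with an F distribution with (df_reduced - df_full, df_full) degrees of freedom.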
Cheng, Ho-Ming y 鄭賀名. "Developing Multivariate Linear Regression Models to Predict the Electrochemical Performance of Lithium Ion Batteries Based on Material Property Parameters". Thesis, 2018. http://ndltd.ncl.edu.tw/handle/bh5524.
國立臺灣科技大學
材料科學與工程系
106
Predicting the electrochemical performance of active materials before they are assembled into lithium ion batteries would cut the cost and time of assembling coin cells and running charge and discharge tests. It is therefore valuable to establish a statistical model that precisely predicts the electrochemical performance of active materials in lithium ion batteries before cell assembly. In this study, we used 11 different LiFePO4 powders prepared by manufacturers as the cathode active material and measured their properties, then prepared cathode electrodes and ran electrochemical experiments. The acquired material property parameters and the electrochemical scores were correlated using multivariate linear regression models. We first used XRD, FTIR, and EA techniques to measure the crystal structure, the vibration of the PO43- functional group, and the carbon content, respectively. Next, we made cathode electrodes from these 11 LiFePO4 products and assembled them into coin cells; we then ran capacity tests at various current rates and cycleability tests at a 2 C current rate for 1,000 cycles. The regression coefficients were estimated by the least squares method, and the regression models were thus established. We hope to popularize this statistical predictive strategy in materials science so that future researchers can predict product performance in a cost-effective and timely manner. In the second analysis of this study, a regression model for predicting the polarization potential in CV measurements of LiFePO4/C cathodes was developed from several material property parameters. To assess whether the predicted values agree with the observed data at a 95 % level of confidence, a paired t-test was employed to compare the means of the two populations.
Moreover, an F-test was applied to the ratio of the variances of the two datasets to confirm that the two populations have about the same expected squared deviation from their means. In the third analysis, a sample size calculation was used to evaluate the minimum sample size required to achieve the predetermined power and significance level for our hypothesis tests. The fourth topic uses constrained optimization to find the lowest anticipated overpotential in the CV tests and the corresponding material property parameters according to the fitted equation established in part 3. In the fifth subject, the cycle life tendencies of cathode materials were simulated by time series analysis. Time series analysis helps a battery management system monitor battery health precisely and can thus improve the safety of LIBs. A grey model was further employed to construct the prediction equation for assessing the degradation of cell capacity during long-term cycling. An approach based on information entropy also helped us develop a combination model for forecasting, which further improves the precision of the resulting model. In the sixth work, principal component analysis was performed on the variables obtained from XRD measurements of all the samples. The original data are transformed into uncorrelated principal components, and the unimportant vectors can then be eliminated, so that the remaining principal components explain as much of the variance of the original data as possible. The seventh topic performs factor analysis on a few selected predictor variables to extract unobserved latent variables that might exist. In the eighth analysis, we used structural equation modelling to present visually the statistical correlations among the observed variables and the latent factors.
This path analysis clearly showed the relative importance of the various variables in the regression model. In the 9th subject, partial least squares (PLS) regression was performed on our experimental results, which allows a fitted equation to be constructed when the number of samples collected is smaller than the number of variables under investigation. The variables were standardized so that the regression coefficients can be compared directly, and the PLS regression model also transforms the predictor variables into principal components so that as much of the variance of the data as possible is accounted for. In this study, we used 9 statistical analyses to resolve the correlations among the measured experimental data. The topics researched in this work include: development of regression equations for forecasting electrochemical performance from material properties, verification of the accuracy of the regression models we established, prediction of cycle life trends for the studied cathode samples, the best battery capability and the associated variable values indicated by the prediction function, and the principal components that express as much of the data dispersion as possible. We hope these mathematical tools will help the scientific community improve its results in the future.
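The regression-and-verification workflow summarized in the abstract above (least squares fit, then a paired t-test of predicted against observed values) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the thesis's own code: the three predictors, the coefficients, the noise level, and the use of leave-one-out predictions for the paired comparison are all assumptions introduced here. Only the sample count of 11 comes from the abstract.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) for y ~ [1, X] @ beta."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    """Evaluate the fitted linear model on the rows of X."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ beta

def loo_predictions(X, y):
    """Predict each sample from a model fitted on all the other samples."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta = fit_ols(X[keep], y[keep])
        preds[i] = predict(beta, X[i:i + 1])[0]
    return preds

def paired_t_statistic(a, b):
    """t statistic for the mean of the paired differences a - b."""
    d = a - b
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))

# Synthetic stand-in data (assumption): 11 samples, 3 hypothetical
# material-property predictors, one electrochemical response.
rng = np.random.default_rng(0)
n_samples, n_features = 11, 3
X = rng.normal(size=(n_samples, n_features))
true_beta = np.array([2.0, 0.5, -1.0, 0.3])  # hypothetical coefficients
y = predict(true_beta, X) + rng.normal(scale=0.1, size=n_samples)

beta_hat = fit_ols(X, y)
y_loo = loo_predictions(X, y)
t_stat = paired_t_statistic(y, y_loo)

# Two-sided 95% critical value of Student's t with n - 1 = 10 degrees of freedom.
T_CRIT_95_DF10 = 2.228
means_differ = abs(t_stat) > T_CRIT_95_DF10
```

Leave-one-out predictions are used here because, for a least squares model with an intercept, the in-sample residuals sum to zero, so a paired t-test on the fitted values themselves would report no mean difference by construction.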
Chen, Ruidi. "Distributionally Robust Learning under the Wasserstein Metric". Thesis, 2019. https://hdl.handle.net/2144/38236.
Burombo, Emmanuel Chamunorwa. "Statistical modelling of return on capital employed of individual units". Diss., 2014. http://hdl.handle.net/10500/19627.
Mathematical Sciences
M. Sc. (Statistics)
Sandström, Sara. "Modellering av volym samt max- och medeldjup i svenska sjöar : en statistisk analys med hjälp av geografiska informationssystem". Thesis, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-325822.
Reis, Marco Paulo Seabra. "Monitorização, modelação e melhoria de processos químicos : abordagem multiescala baseada em dados". Doctoral thesis, 2006. http://hdl.handle.net/10316/7375.
Processes in modern chemical processing plants are typically very complex, and this complexity is also present in the collected data, which contain the cumulative effect of many underlying phenomena and disturbances with different patterns in the time/frequency domain. Such characteristics motivate the development and application of data-driven multiscale approaches to process analysis that can selectively analyze the information contained at different scales. Even in these cases, however, a number of additional complicating features can keep the analysis from being completely successful. Missing and multirate data structures are two examples of the difficulties that can arise, to which we can add multiresolution data structures, among others. In addition, some further requirements should be considered when performing such an analysis, in particular the incorporation of all available knowledge about the data, namely data uncertainty information. In this context, this thesis addresses the problem of developing frameworks that can perform the required multiscale decomposition analysis while coping with the complex features present in industrial data and, simultaneously, taking measurement uncertainty information into account. These frameworks prove useful for conducting data analysis in these circumstances: they conveniently represent the data and the associated uncertainties at the different relevant resolution levels, and they are also instrumental in selecting the proper scales for the analysis. In line with the efforts described above, and to further explore the information processed by such frameworks, the integration of uncertainty information into common single-scale data analysis tasks is also addressed. We propose developments in this regard in the fields of multivariate linear regression, multivariate statistical process control and process optimization.
The second part of this thesis develops intrinsically multiscale approaches. Two such methodologies are presented in the field of process monitoring: the first aims to detect changes in the multiscale characteristics of profiles, while the second focuses on analysing patterns evolving in the time domain.
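One common way to bring known measurement uncertainty into multivariate linear regression, as the abstract above proposes in general terms, is weighted least squares, where each observation is scaled by the inverse of its standard deviation. The sketch below illustrates that textbook technique only; it is not the method developed in the thesis. The function name, the synthetic data, and the assumption that a single known standard deviation per observation applies to every response column are all hypothetical.

```python
import numpy as np

def weighted_least_squares(X, Y, sigma):
    """Fit Y ~ [1, X] @ B with per-observation noise standard deviations.

    X: (n, p) predictors, Y: (n, q) responses, sigma: (n,) known std devs.
    Scaling each row by 1/sigma turns the heteroscedastic problem into an
    ordinary least squares problem with unit-variance errors.
    """
    A = np.column_stack([np.ones(len(X)), X])
    w = 1.0 / sigma
    B, *_ = np.linalg.lstsq(A * w[:, None], Y * w[:, None], rcond=None)
    return B

# Synthetic heteroscedastic data (assumption): noise grows with the first predictor.
rng = np.random.default_rng(1)
n = 200
X = rng.uniform(0.0, 1.0, size=(n, 2))
sigma = 0.05 + 0.5 * X[:, 0]
B_true = np.array([[1.0, -2.0],
                   [0.5, 1.5],
                   [-1.0, 0.25]])  # hypothetical (intercept + 2 slopes) x 2 responses
A_full = np.column_stack([np.ones(n), X])
Y = A_full @ B_true + rng.normal(size=(n, 2)) * sigma[:, None]

B_wls = weighted_least_squares(X, Y, sigma)
```

With constant `sigma` the weights cancel and the result coincides with ordinary least squares, which is a convenient sanity check for an implementation like this.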
Ouellette, Marie-Hélène. "L’arbre de régression multivariable et les modèles linéaires généralisés revisités : applications à l’étude de la diversité bêta et à l’estimation de la biomasse d’arbres tropicaux". Thèse, 2011. http://hdl.handle.net/1866/5906.
In ecology, for example in ecosystem services studies, descriptive, explanatory and predictive modelling are each relevant in different situations. Particular circumstances may require one type of modelling or another; it is important to choose the method properly to ensure that the final model fits the study's goal. In this thesis, we first explore the explanatory power of the multivariate regression tree (MRT). This modelling technique is based on a recursive bipartitioning algorithm. The tree is fully grown by successive bipartitions and is then pruned by resampling in order to reveal the tree providing the best predictions. This asymmetric analysis of two tables produces groups that are homogeneous in terms of the response and that are constrained by splitting levels in the values of some of the most important explanatory variables. We show that, to calculate the explanatory power of an MRT, an appropriate adjusted coefficient of determination must include an estimate of the degrees of freedom of the MRT model, obtained through an algorithm. This estimate of the population coefficient of determination is practically unbiased. Since MRT is based on discontinuity premises whereas canonical redundancy analysis (RDA) models continuous linear gradients, comparing their explanatory powers enables one to distinguish between those two patterns of species distributions along the explanatory variables. The extensive use of RDA for the study of beta diversity motivated the comparison between its explanatory power and that of MRT. Again from an explanatory perspective, we define a new procedure called a cascade of multivariate regression trees (CMRT). This procedure makes it possible to compute an MRT model in which an order is imposed on nested explanatory hypotheses. CMRT provides a framework to study the exclusive effect of a main and a subordinate set of explanatory variables by calculating their explanatory powers.
The final model is interpreted as in nested MANOVA. This analysis may yield new information about the relationship between the response and the explanatory variables, for example interaction effects between the two explanatory data sets that were not revealed by the usual MRT model. In a separate study, we examine the predictive power of generalized linear models (GLM) for predicting individual tropical tree biomass as a function of allometric shape variables. In particular, we examine the capacity of Gaussian and gamma error structures to provide the most precise predictions. We show that, for a particular species, the gamma error structure is superior in terms of predictive power. This study is part of a practical framework; it is meant to be used as a tool for managers who need to precisely estimate the amount of carbon recaptured by tropical tree plantations. Our conclusions could be integrated within a program of carbon emission reduction by land use changes.
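For reference, the adjusted coefficient of determination mentioned in the abstract above is usually written, for a linear model with n observations and p explanatory variables, as in the first line below. The second line is only a sketch of how an estimated number of model degrees of freedom could take the place of p for an MRT; the symbol \hat{d} is a hypothetical placeholder introduced here, and the abstract does not describe the algorithm that estimates it.

```latex
% Standard adjusted R^2 for a linear model with n observations and p explanatory variables
R^2_{\mathrm{adj}} = 1 - \frac{n - 1}{n - p - 1}\,\bigl(1 - R^2\bigr)

% Assumed analogue for an MRT, with \hat{d} an estimated number of model degrees of freedom
R^2_{\mathrm{adj,MRT}} = 1 - \frac{n - 1}{n - \hat{d} - 1}\,\bigl(1 - R^2\bigr)
```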