Dissertations / Theses on the topic 'Kernel-based model'

To see the other types of publications on this topic, follow the link: Kernel-based model.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 30 dissertations / theses for your research on the topic 'Kernel-based model.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Bose, Aishwarya. "Effective web service discovery using a combination of a semantic model and a data mining technique." Thesis, Queensland University of Technology, 2008. https://eprints.qut.edu.au/26425/1/Aishwarya_Bose_Thesis.pdf.

Full text
Abstract:
With the advent of Service Oriented Architecture, Web Services have gained tremendous popularity. Due to the availability of a large number of Web services, finding an appropriate Web service according to the requirement of the user is a challenge. This warrants the need to establish an effective and reliable process of Web service discovery. A considerable body of research has emerged to develop methods to improve the accuracy of Web service discovery to match the best service. The process of Web service discovery results in suggesting many individual services that partially fulfil the user's interest. Considering the semantic relationships of the words used in describing the services, as well as the use of input and output parameters, can lead to accurate Web service discovery. Appropriate linking of individual matched services should fully satisfy the requirements the user is looking for. This research proposes to integrate a semantic model and a data mining technique to enhance the accuracy of Web service discovery. A novel three-phase Web service discovery methodology has been proposed. The first phase performs match-making to find semantically similar Web services for a user query. In order to perform semantic analysis on the content present in the Web service description language document, the support-based latent semantic kernel is constructed using an innovative concept of binning and merging on a large quantity of text documents covering diverse areas of the domain of knowledge. The use of a generic latent semantic kernel constructed with a large number of terms helps to find the hidden meaning of the query terms which otherwise could not be found. Sometimes a single Web service is unable to fully satisfy the requirement of the user. In such cases, a composition of multiple inter-related Web services is presented to the user. The task of checking the possibility of linking multiple Web services is done in the second phase. Once the feasibility of linking Web services is checked, the objective is to provide the user with the best composition of Web services. In the link analysis phase, the Web services are modelled as nodes of a graph and an all-pair shortest-path algorithm is applied to find the optimum path at the minimum cost for traversal. The third phase, which is the system integration, integrates the results from the preceding two phases by using an original fusion algorithm in the fusion engine. Finally, the recommendation engine, which is an integral part of the system integration phase, makes the final recommendations, including individual and composite Web services, to the user. In order to evaluate the performance of the proposed method, extensive experimentation has been performed. Results of the proposed support-based semantic kernel method of Web service discovery are compared with the results of the standard keyword-based information-retrieval method and a clustering-based machine-learning method of Web service discovery. The proposed method outperforms both the information-retrieval and machine-learning based methods. Experimental results and statistical analysis also show that the best Web service compositions are obtained by considering 10 to 15 Web services found in phase-I for linking. Empirical results also ascertain that the fusion engine boosts the accuracy of Web service discovery by combining the inputs from both the semantic analysis (phase-I) and the link analysis (phase-II) in a systematic fashion.
Overall, the accuracy of Web service discovery with the proposed method shows a significant improvement over traditional discovery methods.
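The link-analysis phase above rests on an all-pair shortest-path computation over a graph of candidate services. As a minimal sketch of that step only, assuming a hypothetical four-service graph with made-up linking costs (not the thesis's cost model or fusion algorithm), the classic Floyd-Warshall algorithm looks like this:

```python
import numpy as np

def floyd_warshall(cost):
    """All-pair shortest paths; cost[i, j] = np.inf if services i and j cannot be linked."""
    n = cost.shape[0]
    dist = cost.copy()
    np.fill_diagonal(dist, 0.0)
    for k in range(n):               # allow paths through intermediate service k
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist

# Hypothetical 4-service graph: finite entries are assumed linking costs.
INF = np.inf
cost = np.array([[INF, 1.0, 4.0, INF],
                 [INF, INF, 2.0, 6.0],
                 [INF, INF, INF, 1.5],
                 [INF, INF, INF, INF]])
print(floyd_warshall(cost))          # cheapest composition from service 0 to 3 costs 4.5
```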
APA, Harvard, Vancouver, ISO, and other styles
2

Bose, Aishwarya. "Effective web service discovery using a combination of a semantic model and a data mining technique." Queensland University of Technology, 2008. http://eprints.qut.edu.au/26425/.

Full text
Abstract:
With the advent of Service Oriented Architecture, Web Services have gained tremendous popularity. Due to the availability of a large number of Web services, finding an appropriate Web service according to the requirement of the user is a challenge. This warrants the need to establish an effective and reliable process of Web service discovery. A considerable body of research has emerged to develop methods to improve the accuracy of Web service discovery to match the best service. The process of Web service discovery results in suggesting many individual services that partially fulfil the user's interest. Considering the semantic relationships of the words used in describing the services, as well as the use of input and output parameters, can lead to accurate Web service discovery. Appropriate linking of individual matched services should fully satisfy the requirements the user is looking for. This research proposes to integrate a semantic model and a data mining technique to enhance the accuracy of Web service discovery. A novel three-phase Web service discovery methodology has been proposed. The first phase performs match-making to find semantically similar Web services for a user query. In order to perform semantic analysis on the content present in the Web service description language document, the support-based latent semantic kernel is constructed using an innovative concept of binning and merging on a large quantity of text documents covering diverse areas of the domain of knowledge. The use of a generic latent semantic kernel constructed with a large number of terms helps to find the hidden meaning of the query terms which otherwise could not be found. Sometimes a single Web service is unable to fully satisfy the requirement of the user. In such cases, a composition of multiple inter-related Web services is presented to the user. The task of checking the possibility of linking multiple Web services is done in the second phase. Once the feasibility of linking Web services is checked, the objective is to provide the user with the best composition of Web services. In the link analysis phase, the Web services are modelled as nodes of a graph and an all-pair shortest-path algorithm is applied to find the optimum path at the minimum cost for traversal. The third phase, which is the system integration, integrates the results from the preceding two phases by using an original fusion algorithm in the fusion engine. Finally, the recommendation engine, which is an integral part of the system integration phase, makes the final recommendations, including individual and composite Web services, to the user. In order to evaluate the performance of the proposed method, extensive experimentation has been performed. Results of the proposed support-based semantic kernel method of Web service discovery are compared with the results of the standard keyword-based information-retrieval method and a clustering-based machine-learning method of Web service discovery. The proposed method outperforms both the information-retrieval and machine-learning based methods. Experimental results and statistical analysis also show that the best Web service compositions are obtained by considering 10 to 15 Web services found in phase-I for linking. Empirical results also ascertain that the fusion engine boosts the accuracy of Web service discovery by combining the inputs from both the semantic analysis (phase-I) and the link analysis (phase-II) in a systematic fashion.
Overall, the accuracy of Web service discovery with the proposed method shows a significant improvement over traditional discovery methods.
APA, Harvard, Vancouver, ISO, and other styles
3

Zhang, Lin. "Semiparametric Bayesian Kernel Survival Model for Highly Correlated High-Dimensional Data." Diss., Virginia Tech, 2018. http://hdl.handle.net/10919/95040.

Full text
Abstract:
We are living in an era in which many mysteries related to science, technologies and design can be answered by "learning" the huge amount of data accumulated over the past few decades. In the process of those endeavors, highly-correlated high-dimensional data are frequently observed in many areas, including predicting shelf life, controlling manufacturing processes, and identifying important pathways related to diseases. We define a "set" as a group of highly-correlated high-dimensional (HCHD) variables that possess a certain practical meaning or control a certain process, and define an "element" as one of the HCHD variables within a certain set. Such an elements-within-a-set structure is very complicated because: (i) the dimensions of elements in different sets can vary dramatically, ranging from two to hundreds or even thousands; (ii) the true relationships, including element-wise associations, set-wise interactions, and element-set interactions, are unknown; and (iii) the sample size (n) is usually much smaller than the dimension of the elements (p). The goal of this dissertation is to provide a systematic way to identify both the set effects and the element effects associated with survival outcomes from heterogeneous populations using Bayesian survival kernel models. By connecting kernel machines with semiparametric Bayesian hierarchical models, the proposed unified model frameworks can identify significant elements as well as sets regardless of mis-specifications of distributions or kernels. The proposed methods can potentially be applied to a vast range of fields to solve real-world problems.
PhD
APA, Harvard, Vancouver, ISO, and other styles
4

Garg, Aditie. "Designing Reactive Power Control Rules for Smart Inverters using Machine Learning." Thesis, Virginia Tech, 2018. http://hdl.handle.net/10919/83558.

Full text
Abstract:
Due to the increasing penetration of solar power generation, distribution grids are facing a number of challenges. Frequent reverse active power flows can result in rapid fluctuations in voltage magnitudes. However, with the revised IEEE 1547 standard, smart inverters can actively control their reactive power injection to minimize voltage deviations and power losses in the grid. Reactive power control and globally optimal inverter coordination in real time are computationally and communication-wise demanding, whereas the local Volt-VAR or Watt-VAR control rules are subpar for enhanced grid services. This thesis uses machine learning tools and poses reactive power control as a kernel-based regression task to learn policies and evaluate the reactive power injections in real time. This novel approach performs inverter coordination through non-linear control policies centrally designed by the operator on a slower timescale using anticipated scenarios for load and generation. In real time, the inverters feed locally and/or globally collected grid data to the customized control rules. The developed models are highly adjustable to the available computation and communication resources. The developed control scheme is tested on the IEEE 123-bus system and is seen to efficiently minimize losses and regulate voltage within the permissible limits.
Master of Science
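Posing reactive power control as kernel-based regression can be pictured with a minimal kernel ridge regression sketch: hypothetical local measurements are mapped to reactive power setpoints fitted offline from anticipated scenarios and then evaluated cheaply in real time. The feature layout, surrogate setpoints and parameters below are assumptions for illustration, not the design used in the thesis.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 1.0, size=(200, 3))     # hypothetical local grid data per scenario
q_train = -0.4 * X_train[:, 0] + 0.2 * np.sin(3 * X_train[:, 1])  # surrogate optimal setpoints

lam = 1e-2                                          # ridge regularization
K = rbf_kernel(X_train, X_train)
alpha = np.linalg.solve(K + lam * np.eye(len(K)), q_train)   # fit offline (slow timescale)

X_new = rng.uniform(0.0, 1.0, size=(5, 3))          # real-time measurements fed to the rule
q_setpoint = rbf_kernel(X_new, X_train) @ alpha     # fast evaluation of the learned policy
print(q_setpoint)
```

The offline solve plays the role of the slow-timescale central design, while the final matrix-vector product is the cheap local rule evaluation.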
APA, Harvard, Vancouver, ISO, and other styles
5

Kim, Byung-Jun. "Semiparametric and Nonparametric Methods for Complex Data." Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/99155.

Full text
Abstract:
A variety of complex data types have emerged in many research fields such as epidemiology, genomics, and analytical chemistry with the development of science, technologies, and design schemes over the past few decades. For example, in epidemiology, the matched case-crossover study design is used to investigate the association between clustered binary outcomes of disease and a covariate measured with error within a certain period by stratifying subjects' conditions. In genomics, highly-correlated and high-dimensional (HCHD) data are required to identify important genes and their interaction effects on diseases. In analytical chemistry, multiple time series data are generated to recognize complex patterns among multiple classes. Due to this great diversity, we encounter three problems in analyzing such complex data in this dissertation. We have provided several contributions to semiparametric and nonparametric methods for dealing with the following problems: the first is to propose a method for testing the significance of a functional association under the matched study; the second is to develop a method to simultaneously identify important variables and build a network in HCHD data; the third is to propose a multi-class dynamic model for recognizing a pattern in time-trend analysis. For the first topic, we propose a semiparametric omnibus test for testing the significance of a functional association between clustered binary outcomes and covariates with measurement error by taking into account the effect modification of matching covariates. We develop a flexible omnibus test for testing purposes without a specific alternative form of a hypothesis. The advantages of our omnibus test are demonstrated through simulation studies and 1-4 bidirectional matched data analyses from an epidemiology study. For the second topic, we propose a joint semiparametric kernel machine network approach to provide a connection between variable selection and network estimation. Our approach is a unified and integrated method that can simultaneously identify important variables and build a network among them. We develop our approach under a semiparametric kernel machine regression framework, which allows for the possibility that each variable might be nonlinear and is likely to interact with the others in a complicated way. We demonstrate our approach using simulation studies and a real application to genetic pathway analysis. Lastly, for the third project, we propose a Bayesian focal-area detection method for a multi-class dynamic model under a Bayesian hierarchical framework. Two-step Bayesian sequential procedures are developed to estimate patterns and detect focal intervals, which can be used for gas chromatography. We demonstrate the performance of our proposed method using a simulation study and a real application to gas chromatography on the Fast Odor Chromatographic Sniffer (FOX) system.
Doctor of Philosophy
A variety of complex data types have emerged in many research fields such as epidemiology, genomics, and analytical chemistry with the development of science, technologies, and design schemes over the past few decades. For example, in epidemiology, the matched case-crossover study design is used to investigate the association between clustered binary outcomes of disease and a covariate measured with error within a certain period by stratifying subjects' conditions. In genomics, highly-correlated and high-dimensional (HCHD) data are required to identify important genes and their interaction effects on diseases. In analytical chemistry, multiple time series data are generated to recognize complex patterns among multiple classes. Due to this great diversity, we encounter three problems in analyzing the following three types of data: (1) matched case-crossover data, (2) HCHD data, and (3) time-series data. We contribute to the development of statistical methods to deal with such complex data. First, under the matched study, we discuss an idea about hypothesis testing to effectively determine the association between observed factors and the risk of a disease of interest. Because, in practice, we do not know the specific form of the association, it might be challenging to set a specific alternative hypothesis. Reflecting reality, we consider the possibility that some observations are measured with errors. By considering these measurement errors, we develop a testing procedure under the matched case-crossover framework. This testing procedure has the flexibility to make inferences on various hypothesis settings. Second, we consider data where the number of variables is very large compared to the sample size, and the variables are correlated with each other. In this case, our goal is to identify important variables for the outcome among a large number of variables and build their network. For example, identifying a few genes among the whole genome associated with diabetes can be used to develop biomarkers. With our proposed approach in the second project, we can identify differentially expressed and important genes and their network structure with consideration for the outcome. Lastly, we consider the scenario of changing patterns of interest over time, with an application to gas chromatography. We propose an efficient detection method to effectively distinguish the patterns of multi-level subjects in time-trend analysis. We suggest that our proposed method can give valuable information for an efficient search for the distinguishable patterns so as to reduce the burden of examining all observations in the data.
APA, Harvard, Vancouver, ISO, and other styles
6

Polajnar, Tamara. "Semantic models as metrics for kernel-based interaction identification." Thesis, University of Glasgow, 2010. http://theses.gla.ac.uk/2260/.

Full text
Abstract:
Automatic detection of protein-protein interactions (PPIs) in biomedical publications is vital for efficient biological research. It also presents a host of new challenges for pattern recognition methodologies, some of which will be addressed by the research in this thesis. Proteins are the principal method of communication within a cell; hence, this area of research is strongly motivated by the needs of biologists investigating sub-cellular functions of organisms, diseases, and treatments. These researchers rely on the collaborative efforts of the entire field and communicate through experimental results published in reviewed biomedical journals. The substantial number of interactions detected by automated large-scale PPI experiments, combined with the ease of access to the digitised publications, has increased the number of results made available each day. The ultimate aim of this research is to provide tools and mechanisms to aid biologists and database curators in locating relevant information. As part of this objective this thesis proposes, studies, and develops new methodologies that go some way to meeting this grand challenge. Pattern recognition methodologies are one approach that can be used to locate PPI sentences; however, most accurate pattern recognition methods require a set of labelled examples to train on. For this particular task, the collection and labelling of training data is highly expensive. On the other hand, the digital publications provide a plentiful source of unlabelled data. The unlabelled data is used, along with word cooccurrence models, to improve classification using Gaussian processes, a probabilistic alternative to the state-of-the-art support vector machines. This thesis presents and systematically assesses the novel methods of using the knowledge implicitly encoded in biomedical texts and shows an improvement on the current approaches to PPI sentence detection.
APA, Harvard, Vancouver, ISO, and other styles
7

Lyubchyk, Leonid, Oleksy Galuza, and Galina Grinberg. "Ranking Model Real-Time Adaptation via Preference Learning Based on Dynamic Clustering." Thesis, ННК "IПСА" НТУУ "КПI iм. Iгоря Сiкорського", 2017. http://repository.kpi.kharkov.ua/handle/KhPI-Press/36819.

Full text
Abstract:
The proposed preference learning on clusters method allows the advantages of the kernel-based approach to be fully realized, while the dimension of the model is determined by a pre-selected number of clusters and its complexity does not grow with an increasing number of observations. Thus, the real-time preference function identification algorithm based on a training data stream includes successive estimates of the cluster parameters, as well as updating of the average cluster ranks and recurrent kernel-based nonparametric estimation of the preference model.
APA, Harvard, Vancouver, ISO, and other styles
8

Vlachos, Dimitrios. "Novel algorithms in wireless CDMA systems for estimation and kernel based equalization." Thesis, Brunel University, 2012. http://bura.brunel.ac.uk/handle/2438/7658.

Full text
Abstract:
A powerful technique is presented for joint blind channel estimation and carrier offset estimation for code-division multiple access (CDMA) communication systems. The new technique combines singular value decomposition (SVD) analysis with the carrier offset parameter. Current blind methods sustain a high computational complexity as they require the computation of a large SVD twice, and they are sensitive to accurate knowledge of the noise subspace rank. The proposed method overcomes both problems by computing the SVD only once. Extensive simulations using MATLAB demonstrate the robustness of the proposed scheme; its performance is comparable to other existing SVD techniques, with significantly lower computational cost (as much as 70% less), because it does not require knowledge of the rank of the noise subspace. A kernel-based equalization scheme for CDMA communication systems is also proposed, designed and simulated using MATLAB; the proposed scheme outperforms all the other methods considered.
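The computational saving highlighted above comes from performing a single SVD. A generic sketch of that one decomposition on a synthetic received-signal matrix (with an assumed known signal rank, and no claim to reproduce the thesis's subspace construction) is:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic received-signal matrix: a low-rank signal part plus noise.
signal = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 100))
Y = signal + 0.1 * rng.standard_normal((64, 100))

# One SVD of the data matrix; the leading singular vectors span the signal subspace.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
signal_subspace = U[:, :4]          # rank 4 is known in this toy example
print(s[:6])                        # gap after the 4th singular value separates signal from noise
```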
APA, Harvard, Vancouver, ISO, and other styles
9

Buch, Armin [Verfasser], and Gerhard [Akademischer Betreuer] Jäger. "Linguistic Spaces : Kernel-based models of natural language / Armin Buch ; Betreuer: Gerhard Jäger." Tübingen : Universitätsbibliothek Tübingen, 2011. http://d-nb.info/1161803572/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Mahfouz, Sandy. "Kernel-based machine learning for tracking and environmental monitoring in wireless sensor networks." Thesis, Troyes, 2015. http://www.theses.fr/2015TROY0025/document.

Full text
Abstract:
This thesis focuses on the problems of localization and gas field monitoring using wireless sensor networks. First, we focus on the geolocalization of sensors and target tracking. Using the powers of the signals exchanged between sensors, we propose a localization method combining radio-location fingerprinting and kernel methods from statistical machine learning. Based on this localization method, we develop a target tracking method that enhances the estimated position of the target by combining it with acceleration information using the Kalman filter. We also provide a semi-parametric model that estimates the distances separating sensors based on the powers of the signals exchanged between them. This semi-parametric model is a combination of the well-known log-distance propagation model with a non-linear fluctuation term estimated within the framework of kernel methods. The target's position is estimated by incorporating acceleration information into the distances separating the target from the sensors, using either the Kalman filter or the particle filter. In another context, we study gas diffusions in wireless sensor networks, also using machine learning. We propose a method that allows the detection of multiple gas diffusions based on concentration measures regularly collected from the studied region. The method then estimates the parameters of the multiple gas sources, including the sources' locations and their release rates.
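The fingerprinting-plus-kernel-methods idea can be illustrated by a small kernel ridge regression that maps received-signal-strength fingerprints to 2-D positions. The anchor layout, toy log-distance model and hyperparameters below are invented for the sketch and are not those of the thesis:

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.05):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(2)
anchors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])  # fixed sensors
positions = rng.uniform(0.0, 10.0, size=(150, 2))                          # training points

def rss(points):
    """Toy log-distance model: received signal strength from each point to each anchor."""
    d = np.linalg.norm(points[:, None, :] - anchors[None, :, :], axis=-1) + 1e-3
    return -40.0 - 20.0 * np.log10(d)

K = rbf_kernel(rss(positions), rss(positions))
alpha = np.linalg.solve(K + 1e-3 * np.eye(len(K)), positions)   # one weight vector per coordinate

target = np.array([[3.0, 7.0]])
estimate = rbf_kernel(rss(target), rss(positions)) @ alpha
print(estimate)    # should land near (3, 7)
```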
APA, Harvard, Vancouver, ISO, and other styles
11

Zhai, Jing. "Efficient Exact Tests in Linear Mixed Models for Longitudinal Microbiome Studies." Thesis, The University of Arizona, 2016. http://hdl.handle.net/10150/612412.

Full text
Abstract:
Microbiome plays an important role in human health. The analysis of the association between microbiome and clinical outcomes has become an active direction in biostatistics research. Testing the microbiome effect on clinical phenotypes directly using operational taxonomic unit (OTU) abundance data is a challenging problem due to the high dimensionality, non-normality and phylogenetic structure of the data. Most studies only focus on describing the change in the microbial population that occurs in patients who have a specific clinical condition. Instead, a statistical strategy utilizing distance-based or similarity-based non-parametric testing, in which a distance or similarity measure is defined between any two microbiome samples, has been developed to assess the association between microbiome composition and outcomes of interest. Despite the improvements, this test is still not easily interpretable and not able to adjust for potential covariates. A novel approach, the kernel-based semi-parametric regression framework, is applied to evaluate the association while controlling for covariates. The framework utilizes a kernel function, which is a measure of similarity between samples' microbiome compositions and characterizes the relationship between the microbiome and the outcome of interest. This kernel-based regression model, however, cannot be applied in longitudinal studies since it cannot model the correlation between repeated measurements. We propose microbiome association exact tests (MAETs) in linear mixed models that can deal with longitudinal microbiome data. MAETs can test not only the effect of the overall microbiome but also the effect of a specific cluster of OTUs while controlling for others, by introducing more random effects into the model. The current methods for multiple variance component testing are based on either the asymptotic distribution or the parametric bootstrap, which require a large sample size or a high computational cost. The exact (R)LRT test, a computationally efficient and powerful testing methodology, was derived by Crainiceanu. Since the exact (R)LRT can only be used to test one variance component, we propose an approach that combines the recent development of the exact (R)LRT with a strategy for simplifying a linear mixed model with multiple variance components to the single-component case. Monte Carlo simulation studies show correctly controlled type I error and superior power in testing the association between microbiome and outcomes in longitudinal studies. Finally, the MAETs were applied to longitudinal pulmonary microbiome datasets to demonstrate that microbiome composition is associated with lung function and immunological outcomes. We also found two interesting genera, Prevotella and Veillonella, which are associated with forced vital capacity.
APA, Harvard, Vancouver, ISO, and other styles
12

Funke, Benedikt [Verfasser], Jeannette H. C. [Akademischer Betreuer] Woerner, and Herold [Gutachter] Dehling. "Kernel based nonparametric coefficient estimation in diffusion models / Benedikt Funke. Betreuer: Jeannette H. C. Woerner. Gutachter: Herold Dehling." Dortmund : Universitätsbibliothek Dortmund, 2015. http://d-nb.info/1111103275/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Strengbom, Kristoffer. "Mobile Services Based Traffic Modeling." Thesis, Linköpings universitet, Matematisk statistik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-116459.

Full text
Abstract:
Traditionally, communication systems have been dominated by voice applications. Today, with the emergence of smartphones, focus has shifted towards packet-switched networks. The Internet provides a wide variety of services such as video streaming, web browsing, e-mail, etc., and IP traffic models are needed in all stages of product development, from early research to system tests. In this thesis, we propose a multi-level model of IP traffic where the user behavior and the actual IP traffic generated by different services are considered as two independent random processes. The model is based on observations of IP packet header logs from live networks. In this way, models can be updated to reflect the ever-changing service and end-user equipment usage. Thus, the work can be divided into two parts. The first part is concerned with modeling the traffic from different services. A subscriber is interested in enjoying the services provided on the Internet, and traffic modeling should reflect the characteristics of these services. An underlying assumption is that different services generate their own characteristic pattern of data. The FFT is used to analyze the packet traces. We show that the traces contain strong periodicities and that some services are more or less deterministic. For some services this strong frequency content is due to the characteristics of the cellular network, and for others it is actually a programmed behavior of the service. The periodicities indicate that there are strong correlations between individual packets or bursts of packets. The second part is concerned with the user behavior, i.e. how the users access the different services in time. We propose a model based on a Markov renewal process and estimate the model parameters. In order to evaluate the model we compare it to two simpler models. We use model selection, using the model's ability to predict future observations as the selection criterion. We show that the proposed Markov renewal model is the best of the three models in this sense. The model selection framework can be used to evaluate future models.
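The claim that service traces contain strong periodicities can be checked with a basic FFT on a packet-count series. The synthetic trace, burst period and sampling rate below are arbitrary stand-ins, not values measured in the thesis:

```python
import numpy as np

fs = 10.0                                   # samples per second of the packet-count series
t = np.arange(0, 60.0, 1.0 / fs)
rng = np.random.default_rng(3)
# Synthetic trace: short bursts every 2 seconds (0.5 Hz) plus background noise.
trace = 5.0 * (np.sin(2 * np.pi * 0.5 * t) > 0.95) + rng.poisson(1.0, t.size)

spectrum = np.abs(np.fft.rfft(trace - trace.mean()))
freqs = np.fft.rfftfreq(trace.size, d=1.0 / fs)
peak = freqs[np.argmax(spectrum)]
print(f"dominant periodicity at {peak:.2f} Hz")   # close to the 0.5 Hz burst rate
```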
APA, Harvard, Vancouver, ISO, and other styles
14

Ahmed, Mohamed Salem. "Contribution à la statistique spatiale et l'analyse de données fonctionnelles." Thesis, Lille 3, 2017. http://www.theses.fr/2017LIL30047/document.

Full text
Abstract:
This thesis is about statistical inference for spatial and/or functional data. Indeed, we are interested in the estimation of unknown parameters of some models from random or non-random (stratified) samples composed of independent or spatially dependent variables. The specificity of the proposed methods lies in the fact that they take into consideration the nature of the considered sample (stratified or spatial sample). We begin by studying data valued in a space of infinite dimension, so-called "functional data". First, we study a functional binary choice model explored in a case-control or choice-based sample design context. The specificity of this study is that the proposed method takes into account the sampling scheme. We describe a conditional likelihood function under the sampling distribution and a dimension-reduction strategy to define a feasible conditional maximum likelihood estimator of the model. Asymptotic properties of the proposed estimates as well as their application to simulated and real data are given. Secondly, we explore a functional linear autoregressive spatial model whose particularity is the functional nature of the explanatory variable and the structure of the spatial dependence. The estimation procedure consists of reducing the infinite dimension of the functional variable and maximizing a quasi-likelihood function. We establish the consistency and asymptotic normality of the estimator. The usefulness of the methodology is illustrated via simulations and an application to some real data. In the second part of the thesis, we address some estimation and prediction problems for real-valued random spatial variables. We start by generalizing the k-nearest neighbors method, namely k-NN, to predict a spatial process at non-observed locations using some covariates. The specificity of the proposed k-NN predictor lies in the fact that it is flexible and allows for some heterogeneity in the covariate. We establish the almost complete convergence, with rates, of the spatial predictor, whose performance is demonstrated by an application to simulated and environmental data. In addition, we generalize the partially linear probit model for independent data to the spatial case. We use a linear process for the disturbances allowing various spatial dependencies and propose a semiparametric estimation approach based on weighted likelihood and generalized method of moments. We establish the consistency and asymptotic distribution of the proposed estimators and investigate the finite-sample performance of the estimators on simulated data. We end with an application of spatial binary choice models to identify UADT (upper aerodigestive tract) cancer risk factors in the north region of France, which displays the highest rates of such cancer incidence and mortality in the country.
APA, Harvard, Vancouver, ISO, and other styles
15

Nguyen, Van Hanh. "Modèles de mélange semi-paramétriques et applications aux tests multiples." Phd thesis, Université Paris Sud - Paris XI, 2013. http://tel.archives-ouvertes.fr/tel-00987035.

Full text
Abstract:
In a multiple testing context, we consider a semiparametric mixture model with two components. One component is assumed known and corresponds to the distribution of the p-values under the null hypothesis, with prior probability p. The other component f is nonparametric and represents the distribution of the p-values under the alternative hypothesis. The problem of estimating the parameters p and f of the model arises in procedures that control the false discovery rate (FDR). In the first part of this dissertation, we study the estimation of the proportion p. We discuss asymptotic efficiency results and establish that two different cases arise depending on whether f vanishes or not on a whole non-empty interval. In the first case (vanishing on a whole interval), we present estimators that converge at the parametric rate, compute the optimal asymptotic variance and conjecture that no estimator is asymptotically efficient (i.e., attains the optimal asymptotic variance). In the second case, we prove that the quadratic risk of any estimator does not converge at the parametric rate. In the second part of the dissertation, we focus on the estimation of the unknown nonparametric component f of the mixture, relying on a preliminary estimator of p. We propose and study the asymptotic properties of two different estimators of this unknown component. The first estimator is a kernel estimator with random weights. We establish an upper bound on its pointwise quadratic risk, showing a classical nonparametric rate of convergence over a Hölder class. The second estimator is a regularized maximum likelihood estimator. It is computed by an iterative algorithm, for which we establish a descent property of a criterion. Moreover, these estimators are used in a multiple testing procedure to estimate the local false discovery rate (lfdr).
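A minimal sketch of the two-component p-value mixture and of a randomly weighted kernel estimator of the alternative component f is given below. The simulated p-values, the plug-in value of the proportion and the weighting rule are assumptions chosen for illustration, not the thesis's exact estimator:

```python
import numpy as np

rng = np.random.default_rng(4)
# Simulated p-values: proportion p of true nulls (uniform) and 1-p from the alternative f.
p_null = 0.8
null_p = rng.uniform(size=rng.binomial(2000, p_null))
alt_p = rng.beta(0.3, 4.0, size=2000 - null_p.size)        # a stand-in alternative density
pvals = np.concatenate([null_p, alt_p])

p_hat = 0.8                                                 # assume a preliminary estimate of p
grid = np.linspace(0.001, 0.999, 200)

def kde(x, data, weights, h=0.05):
    """Gaussian kernel density estimate with observation weights."""
    u = (x[:, None] - data[None, :]) / h
    k = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
    return (k * weights[None, :]).sum(axis=1) / (h * weights.sum())

rough_g = kde(grid, pvals, np.ones_like(pvals))             # pilot estimate of the mixture density g
g_at_obs = np.interp(pvals, grid, rough_g)
# Random weights proportional to the posterior probability of the alternative: (g - p_hat) / g.
weights = np.clip(1.0 - p_hat / np.maximum(g_at_obs, 1e-8), 0.0, None)
f_hat = kde(grid, pvals, weights)                           # weighted KDE of the alternative f
print(f_hat[:5])
```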
APA, Harvard, Vancouver, ISO, and other styles
16

Jer-Fu, Liu, and 劉哲夫. "The Development of a Kernel for OODB Based on V-R Model." Thesis, 1995. http://ndltd.ncl.edu.tw/handle/85060689060580327913.

Full text
Abstract:
Master's
National Chiao Tung University
Department of Computer and Information Science
83
We have developed a kernel for OODB in this thesis, based on the V-R model. The V-R model, basically, is a conceptual model which can satisfy the fundamental requirements of an OODB and which has some good properties for object management and query processing, so it is natural and reasonable to develop our system based on the V-R model. By utilizing this kernel, we can reduce the research and development time for building a database system. Besides, we hope our kernel can also help to test and develop those techniques necessary in an OODB system, such as the query processor, object storage, etc. Because the V-R model is only a conceptual model, we have to propose an effective way to implement it and design a sound system architecture. Our system allows objects to change roles, which makes database use more natural and keeps object data more consistent, reducing the difficulty of object modification. It has the following advantages: (1) extension of an object's role, (2) data integrity, (3) data sharing, and (4) a good view mechanism. Finally, we implement an OODB with our designed kernel, named EODB (Easy use and efficient Object-oriented DataBase), in our Database Lab. at Chiao Tung University.
APA, Harvard, Vancouver, ISO, and other styles
17

Hao, Pei-Yi, and 郝沛毅. "Fuzzy Decision Model Using Support Vector Learning — A Kernel Function Based Approach." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/18494175702000813535.

Full text
Abstract:
PhD
National Cheng Kung University
Department of Computer Science and Information Engineering
91
Support Vector Machines (SVMs) have recently been introduced as a new technique for solving pattern recognition problems. According to the theory of SVMs, while traditional techniques for pattern recognition are based on the minimization of the empirical risk, that is, on the attempt to optimize the performance on the training set, SVMs minimize the structural risk, that is, the probability of misclassifying yet-to-be-seen patterns for a fixed but unknown probability distribution of the data. Fuzziness must be considered in systems where human estimation is influential. In this thesis, we incorporate the concept of fuzzy set theory into the support vector machine decision model in several ways. We attempt to preserve the advantages of the support vector machine (i.e., good generalization ability) and of fuzzy set theory (i.e., being closer to human thinking). First, we propose a fuzzy modeling framework based on the support vector machine, a rule-based framework that explicitly characterizes the representation in the fuzzy inference procedure. The support vector learning mechanism provides an architecture to extract support vectors for generating fuzzy IF-THEN rules from the training data set, and a method to describe the fuzzy system in terms of kernel functions. Thus, it has the inherent advantage that the model does not have to determine the number of rules in advance. Moreover, the decision model is no longer a black box. Second, we enlarge SVM clustering by using a generalized ordered weighted averaging (OWA) operator so that multi-sphere SV clustering is capable of adaptively growing a new cell when a new point that does not belong to any existing cluster is presented. Each sphere in the feature space corresponds to a cluster in the original space and, thereby, it is possible to obtain the grades of fuzzy membership, as well as the cluster prototypes (sphere centers), in the partition. Third, we incorporate the concept of fuzzy set theory into SVM regression. The parameters to be identified in SVM regression, such as the components within the weight vector and the bias term, are fuzzy numbers. The desired outputs in the training samples are also fuzzy numbers. The SVM theory characterizes properties of learning machines which enable them to generalize well to unseen data, and fuzzy set theory might be very useful for finding a fuzzy structure in an evaluation system.
APA, Harvard, Vancouver, ISO, and other styles
18

Yang, Hsin-Yu, and 楊新宇. "Transformation Model for Interval Censoring with a Cured Subgroup by Kernel-based Estimation." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/31888676456283846796.

Full text
Abstract:
Master's
Tamkang University
Department of Statistics
103
As clinical research continues to develop, interval-censored data arise more and more often in clinical trials. Sometimes it is hard to observe the exact time of an event, but we know that the observed failure time falls within a time period. In this thesis, we consider mixture cure models for interval-censored data with a cured subgroup, where subjects in this subgroup are not susceptible to the event of interest. We adopt logistic regression to estimate the cure proportion. In addition, we consider semiparametric transformation models to analyze the event time data. We focus on reparametrizing the step function of the unknown baseline hazard function by the logarithm of its jump sizes in Chapter 3, and on a kernel-based approach for smooth estimation of the unknown baseline hazard function in Chapter 4. An EM algorithm is developed for the estimation, and simulation studies are conducted.
APA, Harvard, Vancouver, ISO, and other styles
19

Su, Shao-Zu, and 蘇少祖. "Using Kernel Smoothing Approaches Imporves the Parameter Estimation based on Generalized Partial Credit Model." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/75909558041762230547.

Full text
Abstract:
Master's
National Taichung University of Education
Graduate Institute of Educational Measurement and Statistics
96
In this thesis, a modified version of MMLE/EM is proposed. There are two modifications in the proposed algorithm. First, a kernel density estimation technique is applied to estimate the distribution of the ability parameter in the E-step. Second, kernel density estimation is applied to estimate the item parameters and the ability parameters with EAP in the M-step. Finally, we use this methodology to estimate the ability and item parameters iteratively. This algorithm is named the kernel smoothing - generalized partial credit model, KS-GPCM for short. A simulation experiment based on the generalized partial credit model is conducted to compare the performance of PARSCALE and KS-GPCM. In the experiment, three types of distributions of the ability parameters (normal, bimodal and skewed distributions) are considered. The experimental results show the following: (i) When the distribution of the ability parameter is normal, the RMSE of the ability parameter for PARSCALE is less than that for KS-GPCM. (ii) When the distributions of the ability parameters are bimodal or skewed, the RMSE of the ability parameter for KS-GPCM is less than that for PARSCALE. (iii) When the distribution of the ability parameter is normal, the RMSE of the slope and item step parameters for PARSCALE is less than that for KS-GPCM. (iv) When the distributions of the ability parameters are bimodal or skewed, the RMSE of the slope and item step parameters for KS-GPCM is less than that for PARSCALE.
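The kernel-smoothing ingredient — estimating the ability distribution from provisional EAP ability estimates with a Gaussian kernel and feeding it back as the prior on a quadrature grid — can be sketched as follows; the bandwidth rule and the bimodal toy data are placeholders, not the KS-GPCM implementation:

```python
import numpy as np

rng = np.random.default_rng(5)
theta_eap = np.concatenate([rng.normal(-1.2, 0.5, 300),     # a bimodal set of provisional
                            rng.normal(1.0, 0.6, 200)])     # EAP ability estimates

def gaussian_kde(grid, data):
    """Kernel density estimate with Silverman's rule-of-thumb bandwidth."""
    h = 1.06 * data.std() * data.size ** (-1 / 5)
    u = (grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (data.size * h * np.sqrt(2 * np.pi))

quad_points = np.linspace(-4, 4, 41)                # quadrature grid used in the E-step
weights = gaussian_kde(quad_points, theta_eap)
weights /= weights.sum()                            # discrete ability prior for the next EM cycle
print(weights.round(3))
```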
APA, Harvard, Vancouver, ISO, and other styles
20

Zhang, Rui. "Model selection techniques for kernel-based regression analysis using information complexity measure and genetic algorithms." 2007. http://etd.utk.edu/2007/ZhangRui.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Shun-Te, O., and 歐順德. "The Monte Carlo Simulation Study of The Hybrid Model of Generalized Hidden Markov Model and Kernel smoothing based Item Response Theory." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/02075678413122792655.

Full text
Abstract:
Master's
Asia University
Department of Computer Science and Information Engineering
94
The item characteristic curve (ICC) is the central concept of item response theory; whether one IRT model estimates the ICC more accurately than another cannot be established by mathematical or statistical arguments or logical rules alone. The main purpose of this study is therefore to rely on simulation to compare the accuracy of ICC estimation of four IRT models, i.e., the three-parameter logistic IRT model, the hybrid model of GHMM and 2PL-IRT, kernel smoothing based IRT, and the hybrid model of GHMM and kernel smoothing based IRT. The simulation used MATLAB to develop the programs and generate the data needed. The discrimination, difficulty and ability parameters are assumed to be normally distributed, the guessing parameter is uniformly distributed, the number of items is 25, and six different numbers of examinees are considered: 100, 200, 500, 1000, 1500 and 2000. According to this study, several findings can be concluded as follows: 1. The hybrid model of GHMM and kernel smoothing based IRT is better than the other IRT models in the accuracy of ICC estimation. 2. Whether parametric or nonparametric, an IRT model combined with GHMM gives more accurate ICC estimation. 3. The number of examinees influences the accuracy of ICC estimation: the more examinees, the more accurate the estimation.
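The quantity being compared across models is the item characteristic curve itself. A minimal example of a three-parameter logistic ICC and of the RMSE criterion between a true and an estimated curve, with made-up item parameters, is:

```python
import numpy as np

def icc_3pl(theta, a, b, c):
    """Three-parameter logistic item characteristic curve."""
    return c + (1.0 - c) / (1.0 + np.exp(-1.7 * a * (theta - b)))

theta = np.linspace(-4, 4, 81)
true_icc = icc_3pl(theta, a=1.2, b=0.3, c=0.2)       # hypothetical true item
est_icc = icc_3pl(theta, a=1.0, b=0.5, c=0.15)       # hypothetical estimated item

rmse = np.sqrt(np.mean((true_icc - est_icc) ** 2))   # accuracy criterion for the comparison
print(f"ICC RMSE = {rmse:.4f}")
```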
APA, Harvard, Vancouver, ISO, and other styles
22

Liao, Wei-Chieh, and 廖偉捷. "Real-Time Surface Defect Inspection based on Single Kernel Multiple Threads Computing Model of Compute Unified Device Architecture." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/31581986489037776680.

Full text
Abstract:
Master's
Chung Yuan Christian University
Graduate Institute of Mechanical Engineering
101
Techniques of surface defect detection play an important role in product quality control. Human visual inspection is time-consuming, error-prone, and labor-intensive. Due to the rapid development in the fields of machine vision, image processing, and high-performance computing, a variety of different applications can benefit from the combination of technologies in these fields. A primary industrial application is automatic surface defect detection. In this thesis, for objects with a large surface area, the CUDA (Compute Unified Device Architecture) technology is adopted to satisfy both the high-speed and high-precision requirements of surface defect detection. CUDA is a heterogeneous computing platform and programming model that integrates the CPU (Central Processing Unit) and GPU (Graphics Processing Unit) components. In the proposed method, based on a GPU's maximum allowable number of threads, the image of an object is divided into different image blocks. The image blocks are successively processed by the GPU, via concurrent threads, to determine the edges and number of defects in each single image block. On the other hand, the CPU determines the number of defects that span adjacent image blocks, in order to obtain the correct number of defects in the entire image. Experimental results show that the algorithms developed in the thesis can accurately obtain the edges and number of defects inside an image, containing 2.4576 × 10^8 pixels, in less than one second.
APA, Harvard, Vancouver, ISO, and other styles
23

Nguyen, Quang Anh. "Advanced methods and extensions for kernel-based object tracking." Phd thesis, 2010. http://hdl.handle.net/1885/150670.

Full text
Abstract:
In today's world, the rapid developments in computing technology have generated a great deal of interest in automated video analysis systems such as smart cars, video indexing services and advanced surveillance networks. Amongst those, object tracking research plays a pivotal role in providing an unobtrusive means to detect, track and alert the presence of the objects of interest in a given scene with little to no supervisor interaction. The application fields for object tracking can vary from smart vehicles, advanced surveillance networks to sport analysis and perceptual user interfaces. The diversity in its applications also gives rise to a number of different tracking algorithms tailored to suit the corresponding scenarios and constraints. Along this line, the kernel-based tracker has emerged as one of the benchmark tracking algorithms due to its real-time performance, robustness to noise and tracking accuracy. In this thesis, we explore the possibility of further enhancing the original kernel-based tracker. We do this by firstly developing a probabilistic formulation for the mean-shift algorithm which, in turn, provides a means to estimate the target's severe transformations in size, shape and orientation. For changes in colour appearance due to poor lighting condition of the scene, we opt to combine multiple low complexity image features such as texture, contrast, brightness and colour to improve the tracking performance. To achieve this, we advocate the use of a graphical model to abstract the image features under study into a relational structure and, subsequently, make use of graph-spectral methods to combine them linearly and in a straight-forward manner. Furthermore, we also present an on-line updating method to adjust the tracking model to incorporate new changes in the image features during the course of tracking. To cope with the problem of object occlusion in a high density traffic area, we propose a geometric method to extend the mean-shift algorithm to a 3D setting by the use of multiple cameras with overlapped views of the scene. The methods presented in this thesis not only show significant performance improvements on real-world sequences over a number of benchmark algorithms, but also encompass high generalisation in the spatial and feature domains for future development purposes.
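As background for the kernel-based (mean-shift) tracker discussed above, here is a generic mean-shift mode-seeking step with a Gaussian kernel on 2-D points. It illustrates the basic update rule only, none of the thesis's extensions (probabilistic formulation, graph-spectral feature fusion, on-line updating or multi-camera geometry):

```python
import numpy as np

def mean_shift(points, start, bandwidth=1.0, iters=30):
    """Move `start` toward the local density mode of `points` using a Gaussian kernel."""
    x = start.astype(float)
    for _ in range(iters):
        w = np.exp(-np.sum((points - x) ** 2, axis=1) / (2 * bandwidth ** 2))
        x_new = (w[:, None] * points).sum(axis=0) / w.sum()   # kernel-weighted mean
        if np.linalg.norm(x_new - x) < 1e-6:
            break
        x = x_new
    return x

rng = np.random.default_rng(6)
cloud = rng.normal(loc=[5.0, 5.0], scale=0.7, size=(400, 2))  # stand-in for target pixels
print(mean_shift(cloud, start=np.array([3.0, 3.0])))          # converges near (5, 5)
```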
APA, Harvard, Vancouver, ISO, and other styles
24

Chun-Hsien, Lee. "Video Object Analysis Using Kernel-based Models and Spatiotemporal Similarity." 2006. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0009-2007200610102100.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Lee, Chun-Hsien, and 李俊賢. "Video Object Analysis Using Kernel-based Models and Spatiotemporal Similarity." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/27362489915766268020.

Full text
Abstract:
Master's
Yuan Ze University
Department of Electrical Engineering
94
Video object segmentation plays an important role in many advanced applications such as human-computer interaction, video surveillance, and content-based video coding. This thesis proposes a semantic video object segmentation system which combines spatiotemporal video segmentation and region tracking to extract important semantic objects from videos. At the beginning, multiple cues are used to segment video frames into different regions. The cues include color, edges, motions, and kernel-based models. Since these features are complementary to each other, all desired regions can be well segmented from input frames even though they are captured by a non-stationary camera. Then, according to the spatial information of each segmented region, we construct a region adjacency graph (RAG) which records the relative relations between regions. Based on the RAG, we propose a Bayesian classifier which groups regions by properly checking their spatial and temporal similarities, such that different regions are merged and associated together to form a meaningful object. Since we include a kernel-based analysis in the designed classifier, all desired semantic objects can be well extracted from video sequences. The kernel-based analysis provides rich information for segmenting semantic objects even if they are still in the background and cannot be identified using other features like motion. Experimental results have proved the superiority of the proposed method in object segmentation.
APA, Harvard, Vancouver, ISO, and other styles
26

Konur, Savas, and Marian Gheorghe. "Proceedings of the Workshop on Membrane Computing, WMC 2016." 2016. http://hdl.handle.net/10454/8840.

Full text
Abstract:
This Workshop on Membrane Computing, at the Conference of Unconventional Computation and Natural Computation (UCNC), 12th July 2016, Manchester, UK, is the second event of this type after the Workshop at UCNC 2015 in Auckland, New Zealand*. Following the tradition of the 2015 Workshop, the Proceedings are published as a technical report. The Workshop consisted of one invited talk and six contributed presentations (three full papers and three extended abstracts) covering a broad spectrum of topics in Membrane Computing, from computational and complexity theory to formal verification, simulation and applications in robotics. All these papers (see below), except the last extended abstract, are included in this volume. The invited talk given by Rudolf Freund, “P Systems Working in Set Modes”, presented a general overview of basic topics in the theory of Membrane Computing as well as new developments and future research directions in this area. Radu Nicolescu in “Distributed and Parallel Dynamic Programming Algorithms Modelled on cP Systems” presented an interesting dynamic programming algorithm in a distributed and parallel setting based on P systems enriched with adequate data structure and programming concepts representation. Omar Belingheri, Antonio E. Porreca and Claudio Zandron showed in “P Systems with Hybrid Sets” that P systems with negative multiplicities of objects are less powerful than Turing machines. Artiom Alhazov, Rudolf Freund and Sergiu Ivanov presented in “Extended Spiking Neural P Systems with States” new results regarding the newly introduced topic of spiking neural P systems where states are considered. “Selection Criteria for Statistical Model Checker”, by Mehmet E. Bakir and Mike Stannett, presented some early experiments in selecting adequate statistical model checkers for biological systems modelled with P systems. In “Towards Agent-Based Simulation of Kernel P Systems using FLAME and FLAME GPU”, Raluca Lefticaru, Luis F. Macías-Ramos, Ionuţ M. Niculescu and Laurenţiu Mierlă presented some of the advantages of implementing kernel P systems simulations in FLAME. Andrei G. Florea and Cătălin Buiu, in “An Efficient Implementation and Integration of a P Colony Simulator for Swarm Robotics Applications", presented an interesting and efficient implementation based on P colonies for swarms of Kilobot robots. *http://ucnc15.wordpress.fos.auckland.ac.nz/workshop-on-membrane-computingwmc- at-the-conference-on-unconventional-computation-natural-computation/
APA, Harvard, Vancouver, ISO, and other styles
27

Feng-Xu Li and 李豐旭. "Gabor-Based Kernel PCA with Fractional Power Polynomial Models for Gait Recognition." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/q2e8w3.

Full text
Abstract:
Master's thesis
National Cheng Kung University
Department of Electrical Engineering (in-service master's program)
Academic year 103 (ROC calendar)
In this thesis, we propose a method to extract human gait features from surveillance video through the Gabor wavelet transformation, and then classify these features by kernel principal component analysis (PCA) with a fractional power polynomial model. Because human gait feature extraction can be categorized into the spatial and the temporal domain, we discuss the gait features in both domains. In order not to lose any information from the surveillance video, this thesis uses the spatio-temporal silhouettes of the people walking in the surveillance video; the gait features are then obtained by convolving these silhouettes with Gabor wavelets. We classify the features by kernel PCA with the fractional power polynomial model. Finally, we use the Mahalanobis distance to measure the similarity between gait features. The simulation and experimental results show that Gabor-based kernel PCA with fractional power polynomial models achieves better performance for gait recognition.
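The following is a minimal sketch of the matching pipeline named in this abstract (fractional power polynomial kernel, kernel PCA, Mahalanobis matching), not the thesis's implementation: the features are random placeholders standing in for Gabor magnitude responses, and the exponent d=0.8, the number of subjects and the component count are assumptions.

import numpy as np
from scipy.spatial.distance import mahalanobis

def frac_poly_kernel(A, B, d=0.8):
    """Fractional power polynomial 'kernel': signed fractional power of the dot product."""
    S = A @ B.T
    return np.sign(S) * np.abs(S) ** d

rng = np.random.default_rng(0)
gallery = np.abs(rng.normal(size=(40, 256)))          # placeholder gait feature vectors
labels = np.repeat(np.arange(8), 5)                   # 8 hypothetical subjects, 5 samples each
probe = gallery[::5] + 0.05 * rng.normal(size=(8, 256))  # noisy copy of one sample per subject

# kernel PCA on the double-centred Gram matrix (done manually since the
# fractional-power Gram matrix is not guaranteed to be positive semidefinite)
K = frac_poly_kernel(gallery, gallery)
n = K.shape[0]
J = np.ones((n, n)) / n
Kc = K - J @ K - K @ J + J @ K @ J
w, U = np.linalg.eigh(Kc)
order = np.argsort(w)[::-1][:20]
w, U = w[order], U[:, order]
pos = w > 1e-10                                       # keep only positive eigenvalues
alphas = U[:, pos] / np.sqrt(w[pos])                  # projection coefficients
Z_gal = Kc @ alphas

# project probes with the same centring, then match by Mahalanobis distance
Kp = frac_poly_kernel(probe, gallery)
Jp = np.ones((Kp.shape[0], n)) / n
Kpc = Kp - Jp @ K - Kp @ J + Jp @ K @ J
Z_probe = Kpc @ alphas

VI = np.linalg.pinv(np.cov(Z_gal, rowvar=False))      # inverse covariance for Mahalanobis
for i, z in enumerate(Z_probe):
    dists = [mahalanobis(z, g, VI) for g in Z_gal]
    print("probe", i, "-> subject", labels[int(np.argmin(dists))])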
APA, Harvard, Vancouver, ISO, and other styles
28

Huang, Jhu-Yun, and 黃筑妘. "Liver Segmentation by Kernel-Based Deformable Shape Models in 3D medical Images." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/55967093346722649181.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Vicente, Sergio. "Apprentissage statistique avec le processus ponctuel déterminantal." Thesis, 2021. http://hdl.handle.net/1866/25249.

Full text
Abstract:
Cette thèse aborde le processus ponctuel déterminantal, un modèle probabiliste qui capture la répulsion entre les points d’un certain espace. Celle-ci est déterminée par une matrice de similarité, la matrice noyau du processus, qui spécifie quels points sont les plus similaires et donc moins susceptibles de figurer dans un même sous-ensemble. Contrairement à la sélection aléatoire uniforme, ce processus ponctuel privilégie les sous-ensembles qui contiennent des points diversifiés et hétérogènes. La notion de diversité acquiert une importante grandissante au sein de sciences comme la médecine, la sociologie, les sciences forensiques et les sciences comportementales. Le processus ponctuel déterminantal offre donc une alternative aux traditionnelles méthodes d’échantillonnage en tenant compte de la diversité des éléments choisis. Actuellement, il est déjà très utilisé en apprentissage automatique comme modèle de sélection de sous-ensembles. Son application en statistique est illustrée par trois articles. Le premier article aborde le partitionnement de données effectué par un algorithme répété un grand nombre de fois sur les mêmes données, le partitionnement par consensus. On montre qu’en utilisant le processus ponctuel déterminantal pour sélectionner les points initiaux de l’algorithme, la partition de données finale a une qualité supérieure à celle que l’on obtient en sélectionnant les points de façon uniforme. Le deuxième article étend la méthodologie du premier article aux données ayant un grand nombre d’observations. Ce cas impose un effort computationnel additionnel, étant donné que la sélection de points par le processus ponctuel déterminantal passe par la décomposition spectrale de la matrice de similarité qui, dans ce cas-ci, est de grande taille. On présente deux approches différentes pour résoudre ce problème. On montre que les résultats obtenus par ces deux approches sont meilleurs que ceux obtenus avec un partitionnement de données basé sur une sélection uniforme de points. Le troisième article présente le problème de sélection de variables en régression linéaire et logistique face à un nombre élevé de covariables par une approche bayésienne. La sélection de variables est faite en recourant aux méthodes de Monte Carlo par chaînes de Markov, en utilisant l’algorithme de Metropolis-Hastings. On montre qu’en choisissant le processus ponctuel déterminantal comme loi a priori de l’espace des modèles, le sous-ensemble final de variables est meilleur que celui que l’on obtient avec une loi a priori uniforme.
This thesis presents the determinantal point process, a probabilistic model that captures repulsion between points of a certain space. This repulsion is encoded by a similarity matrix, the kernel matrix, which specifies which points are more similar and therefore less likely to appear in the same subset. This point process gives more weight to subsets characterized by a larger diversity of their elements, which is not the case with traditional uniform random sampling. Diversity has become a key concept in domains such as medicine, sociology, forensic sciences and behavioral sciences. The determinantal point process is considered a promising alternative to traditional sampling methods, since it takes the diversity of the selected elements into account. It is already actively used in machine learning as a subset selection method. Its application in statistics is illustrated with three papers. The first paper presents consensus clustering, which consists in running a clustering algorithm on the same data a large number of times. To sample the initial points of the algorithm, we propose the determinantal point process as a sampling method instead of uniform random sampling and show that the former produces better clustering results. The second paper extends the methodology developed in the first paper to large datasets. Such datasets impose a computational burden since sampling with the determinantal point process is based on the spectral decomposition of the large kernel matrix. We introduce two methods to deal with this issue. These methods also produce better clustering results than consensus clustering based on uniform sampling of the initial points. The third paper addresses the problem of variable selection for the linear model and logistic regression when the number of predictors is large. A Bayesian approach is adopted, using Markov chain Monte Carlo methods with the Metropolis-Hastings algorithm. We show that setting the determinantal point process as the prior distribution on the model space selects a better final model than the one obtained under a uniform prior.
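To illustrate the repulsion property described above (independently of the thesis's own code), the short Python example below enumerates all subsets S of a three-item ground set under a hypothetical similarity matrix L and computes their DPP probabilities P(S) = det(L_S) / det(L + I); the diverse pair {0, 2} receives far more probability than the redundant pair {0, 1}.

import itertools
import numpy as np

def dpp_subset_probabilities(L):
    """Exhaustively compute P(S) = det(L_S) / det(L + I) for every subset S
    of a small ground set, where L is the similarity (kernel) matrix."""
    n = L.shape[0]
    Z = np.linalg.det(L + np.eye(n))          # normalising constant det(L + I)
    probs = {}
    for r in range(n + 1):
        for S in itertools.combinations(range(n), r):
            if S:
                p = np.linalg.det(L[np.ix_(S, S)])
            else:
                p = 1.0                        # determinant of the empty matrix
            probs[S] = p / Z
    return probs

# three items: 0 and 1 are highly similar, 2 is different from both
L = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
probs = dpp_subset_probabilities(L)
print(sum(probs.values()))           # sanity check: the probabilities sum to ~1
print(probs[(0, 1)], probs[(0, 2)])  # the diverse pair {0, 2} is much more likely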
APA, Harvard, Vancouver, ISO, and other styles
30

Tang, X., Qichun Zhang, X. Dai, and Y. Zou. "Neural membrane mutual coupling characterisation using entropy-based iterative learning identification." 2020. http://hdl.handle.net/10454/18180.

Full text
Abstract:
This paper investigates the interaction phenomena of coupled axons, where the mutual coupling factor is presented as a pairwise description. Based on the Hodgkin-Huxley model and the coupling factor matrix, the membrane potentials of the coupled myelinated/unmyelinated axons are quantified, which implies that the neural coupling can be characterised by the presented coupling factor. Meanwhile, an equivalent electric circuit is supplied to illustrate the physical meaning of this extended model. In order to estimate the coupling factor, a data-based iterative learning identification algorithm is presented in which the Rényi entropy of the estimation error is minimised. The convergence of the presented algorithm is analysed and the learning rate is designed. To verify the presented model and algorithm, numerical simulation results are given, indicating their correctness and effectiveness. Furthermore, the statistical description of the neural coupling, the approximation using ordinary differential equations, and the measurement and conduction of nerve signals are discussed as advanced topics. The novelties can be summarised as follows: 1) the Hodgkin-Huxley model has been extended to consider the mutual interaction between neural axon membranes, 2) an iterative learning approach has been developed for factor identification using an entropy criterion, and 3) a theoretical framework has been established for this class of system identification problems with convergence analysis.
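As a minimal sketch of the pairwise coupling idea only (not the paper's entropy-based identification algorithm), the Python code below integrates two textbook Hodgkin-Huxley membranes with forward Euler and couples them through a hypothetical factor matrix K, so that driving one axon also depolarises the other; the values of K, the stimulus and the time step are assumptions for illustration.

import numpy as np

# classic Hodgkin-Huxley parameters (squid axon; units: mV, ms, uA/cm^2)
C, gNa, gK, gL = 1.0, 120.0, 36.0, 0.3
ENa, EK, EL = 50.0, -77.0, -54.387

def a_m(V): return 0.1 * (V + 40.0) / (1.0 - np.exp(-(V + 40.0) / 10.0))
def b_m(V): return 4.0 * np.exp(-(V + 65.0) / 18.0)
def a_h(V): return 0.07 * np.exp(-(V + 65.0) / 20.0)
def b_h(V): return 1.0 / (1.0 + np.exp(-(V + 35.0) / 10.0))
def a_n(V): return 0.01 * (V + 55.0) / (1.0 - np.exp(-(V + 55.0) / 10.0))
def b_n(V): return 0.125 * np.exp(-(V + 65.0) / 80.0)

# two axons; K[i, j] is a hypothetical mutual coupling factor between them
K = np.array([[0.0, 0.2],
              [0.2, 0.0]])
I_ext = np.array([10.0, 0.0])          # only axon 0 is driven externally

dt, steps = 0.01, 5000                 # 50 ms of simulated time
V = np.full(2, -65.0)                  # start near the resting potential
m = a_m(V) / (a_m(V) + b_m(V))         # steady-state gating values at rest
h = a_h(V) / (a_h(V) + b_h(V))
n = a_n(V) / (a_n(V) + b_n(V))

trace = np.empty((steps, 2))
for t in range(steps):
    # pairwise coupling current: sum over j of K[i, j] * (V_j - V_i)
    I_couple = (K * (V[None, :] - V[:, None])).sum(axis=1)
    I_ion = gNa * m**3 * h * (V - ENa) + gK * n**4 * (V - EK) + gL * (V - EL)
    V = V + dt * (I_ext - I_ion + I_couple) / C
    m = m + dt * (a_m(V) * (1 - m) - b_m(V) * m)
    h = h + dt * (a_h(V) * (1 - h) - b_h(V) * h)
    n = n + dt * (a_n(V) * (1 - n) - b_n(V) * n)
    trace[t] = V

print("peak potential of the undriven axon: %.1f mV" % trace[:, 1].max())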
This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 51807010, and in part by the Natural Science Foundation of Hunan under Grant 1541 and Grant 1734.
Research Development Fund Publication Prize Award winner, Nov 2020.
APA, Harvard, Vancouver, ISO, and other styles
