To view the other types of publications on this topic, follow this link: A priori data.

Dissertations on the topic "A priori data"

Cite a source in APA, MLA, Chicago, Harvard, and other citation styles

Select a type of source:

Consult the top 50 dissertations for research on the topic "A priori data".

Next to every work in the bibliography, the option "Add to bibliography" is available. Use it, and the bibliographic reference for the chosen work will be formatted automatically in the required citation style (APA, MLA, Harvard, Chicago, Vancouver, etc.).

You can also download the full text of the scholarly publication in PDF format and read an online abstract of the work, provided the relevant parameters are available in its metadata.

Browse dissertations from a wide variety of disciplines and compile your bibliography correctly.

1

Salas, Percy Enrique Rivera. „StdTrip: An A Priori Design Process for Publishing Linked Data“. Pontifícia Universidade Católica do Rio de Janeiro, 2011. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=28907@1.

Full text of the source
Abstract:
Open Data is a new approach to promoting interoperability of data on the Web. It consists of publishing information produced, archived and distributed by organizations in formats that allow it to be shared, discovered, accessed and easily manipulated by third-party consumers. This approach requires the triplification of datasets, i.e., the conversion of database schemata and their instances to a set of RDF triples. A key issue in this process is deciding how to represent database schema concepts in terms of RDF classes and properties. This is done by mapping database concepts to an RDF vocabulary, used as the base for generating the triples. The construction of this vocabulary is extremely important, because the more standards are reused, the easier it will be to interlink the result with other existing datasets. However, tools available today do not support the reuse of standard vocabularies in the triplification process, but rather create new vocabularies. In this thesis, we present the StdTrip process, which guides users through triplification while promoting the reuse of standard RDF vocabularies.
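The triplification step described here can be illustrated with a minimal sketch (not the StdTrip tool itself): a hypothetical relational row is mapped to RDF triples that reuse the standard FOAF vocabulary instead of minting new terms. Table and column names are made up for the example.
```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import FOAF, RDF

# hypothetical relational row from a table PERSON(id, name, mbox)
row = {"id": 42, "name": "Alice", "mbox": "mailto:alice@example.org"}

EX = Namespace("http://example.org/person/")   # base URI for instance resources
g = Graph()
subject = EX[str(row["id"])]

# reuse the standard FOAF vocabulary rather than creating ex:Person / ex:name
g.add((subject, RDF.type, FOAF.Person))
g.add((subject, FOAF.name, Literal(row["name"])))
g.add((subject, FOAF.mbox, URIRef(row["mbox"])))

print(g.serialize(format="turtle"))
```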
APA, Harvard, Vancouver, ISO, and other citation styles
2

Egidi, Leonardo. „Developments in Bayesian Hierarchical Models and Prior Specification with Application to Analysis of Soccer Data“. Doctoral thesis, Università degli studi di Padova, 2018. http://hdl.handle.net/11577/3427270.

Full text of the source
Abstract:
In recent years the challenge of new prior specifications and of complex hierarchical models has become even more relevant in Bayesian inference. The advent of Markov Chain Monte Carlo techniques, along with new probabilistic programming languages and new algorithms, has extended the boundaries of the field, in both theoretical and applied directions. In the present thesis, we address theoretical and applied tasks. In the first part we propose a new class of prior distributions which may depend on the data and are specified as a mixture of a noninformative and an informative prior. The generic prior belonging to this class provides less information than an informative prior and is less likely to dominate the inference when the data size is small or moderate. Such a distribution is well suited for robustness tasks, especially in case of informative prior misspecification. Simulation studies within conjugate models show that this proposal can be convenient for reducing mean squared errors and improving frequentist coverage. Furthermore, under mild conditions this class of distributions yields some other nice theoretical properties. In the second part of the thesis we use hierarchical Bayesian models for predicting some soccer quantities and extend the usual match-goals modeling strategy by including the bookmakers’ information directly in the model. Posterior predictive checks on in-sample and out-of-sample data show an excellent model fit, good model calibration and, ultimately, the possibility of building efficient betting strategies.
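A sketch of the general idea of mixing a noninformative and an informative prior (an illustration of the principle, not the author's exact construction): for a binomial proportion with a mixture-of-Betas prior, the posterior weight moves towards the flat component when the data conflict with the informative one.
```python
import numpy as np
from scipy.special import betaln, comb

def beta_mixture_posterior(k, n, components, weights):
    """Posterior under a prior that mixes several Beta(a, b) components.

    components: list of (a, b) prior parameters
    weights:    prior mixture weights (summing to 1)
    Returns the updated (a, b) pairs and the posterior mixture weights.
    """
    post_params = [(a + k, b + n - k) for a, b in components]
    # posterior weight of each component is proportional to its prior
    # weight times the beta-binomial marginal likelihood of the data
    log_w = np.array([np.log(w) + np.log(comb(n, k))
                      + betaln(a + k, b + n - k) - betaln(a, b)
                      for (a, b), w in zip(components, weights)])
    post_weights = np.exp(log_w - np.logaddexp.reduce(log_w))
    return post_params, post_weights

# flat Beta(1, 1) mixed with an informative Beta(20, 5); data with 3 successes in 30 trials
params, w = beta_mixture_posterior(3, 30, [(1, 1), (20, 5)], [0.5, 0.5])
print(params, np.round(w, 3))  # most posterior weight moves to the flat component
```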
APA, Harvard, Vancouver, ISO, and other citation styles
3

Tan, Rong Kun Jason. „Scalable Data-agnostic Processing Model with a Priori Scheduling for the Cloud“. Thesis, Curtin University, 2019. http://hdl.handle.net/20.500.11937/75449.

Full text of the source
Abstract:
Cloud computing has been identified as a promising solution for performing big data analytics. However, maximizing cloud utilization while optimizing intranode, internode, and memory management is still an open challenge. This thesis presents a novel resource allocation model for the cloud that load-balances data-agnostic tasks, minimizes intranode and internode delays, and decreases memory consumption where these processes are involved in big data analytics. In conclusion, the proposed model outperforms existing techniques.
APA, Harvard, Vancouver, ISO, and other citation styles
4

Bussy, Victor. „Integration of a priori data to optimise industrial X-ray tomographic reconstruction“. Electronic Thesis or Diss., Lyon, INSA, 2024. http://www.theses.fr/2024ISAL0116.

Full text of the source
Abstract:
This thesis explores research topics in the field of industrial non-destructive testing (NDT) using X-rays. The application of X-ray computed tomography (CT) has significantly expanded, and its use has intensified across many industrial sectors. Due to increasing demands and constraints on inspection processes, CT must continually evolve and adapt. Whether in terms of reconstruction quality or inspection time, X-ray tomography is constantly progressing, particularly in the so-called sparse-view strategy. This strategy involves reconstructing an object using the minimum possible number of radiographic projections while maintaining satisfactory reconstruction quality. This approach reduces acquisition times and associated costs. Sparse-view reconstruction poses a significant challenge as the tomographic problem is ill-conditioned, or, as it is often described, ill-posed. Numerous techniques have been developed to overcome this obstacle, many of which rely on leveraging prior information during the reconstruction process. By exploiting data and knowledge available before the experiment, it is possible to improve reconstruction results despite the reduced number of projections. In our industrial context, for example, the computer-aided design (CAD) model of the object is often available, which provides valuable information about the geometry of the object under study. However, it is important to note that the CAD model only offers an approximate representation of the object. In NDT or metrology, it is precisely the differences between an object and its CAD model that are of interest. Therefore, integrating prior information is complex, as this information is often "approximate" and cannot be used as is. Instead, we propose to judiciously use the geometric information available from the CAD model at each step of the process. We do not propose a single method but rather a methodology for integrating prior geometric information during X-ray tomographic reconstruction.
APA, Harvard, Vancouver, ISO, and other citation styles
5

Skrede, Ole-Johan. „Explicit, A Priori Constrained Model Parameterization for Inverse Problems, Applied on Geophysical CSEM Data“. Thesis, Norges teknisk-naturvitenskapelige universitet, Institutt for matematiske fag, 2014. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-27343.

Full text of the source
Abstract:
This thesis introduces a new parameterization of the model space in global inversion problems. The parameterization provides an explicit representation of the model space with a basis constrained by a priori information about the problem at hand. It is able to represent complex model structures with few parameters, thereby speeding up the inversion, as the number of iterations needed to converge scales strongly with the number of parameters in stochastic, global inversion methods. A standard Simulated Annealing optimization routine is implemented and further extended to optimize over a dynamically varying number of variables. The method is applied to the inversion of marine CSEM data; it inverts both synthetic and real data sets and recovers resistivity profiles that show good resemblance with the provided well-bore log data. The trans-dimensional, self-parameterizing Simulated Annealing algorithm introduced in this thesis proves to be superior to the regular algorithm with fixed parameter dimensions.
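The optimisation routine mentioned above can be sketched as a generic, fixed-dimension simulated-annealing loop; the trans-dimensional extension and the CSEM forward model are beyond this illustration, and the misfit function below is a stand-in.
```python
import numpy as np

def simulated_annealing(misfit, x0, step=0.1, t0=1.0, cooling=0.995, n_iter=5000, seed=0):
    """Minimise `misfit` with a basic simulated-annealing schedule."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    f = misfit(x)
    best_x, best_f, temp = x.copy(), f, t0
    for _ in range(n_iter):
        cand = x + rng.normal(scale=step, size=x.shape)   # random perturbation
        fc = misfit(cand)
        # accept improvements always, worse moves with Boltzmann probability
        if fc < f or rng.random() < np.exp(-(fc - f) / temp):
            x, f = cand, fc
            if f < best_f:
                best_x, best_f = x.copy(), f
        temp *= cooling                                    # geometric cooling
    return best_x, best_f

# stand-in misfit: distance of a 3-parameter model to a "true" model
target = np.array([1.0, -2.0, 0.5])
x, f = simulated_annealing(lambda m: np.sum((m - target) ** 2), x0=np.zeros(3))
print(x, f)
```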
APA, Harvard, Vancouver, ISO, and other citation styles
6

Kindlund, Andrée. „Inversion of SkyTEM Data Based on Geophysical Logging Results for Groundwater Exploration in Örebro, Sweden“. Thesis, Luleå tekniska universitet, Geovetenskap och miljöteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-85315.

Full text of the source
Abstract:
Declining groundwater tables threaten several municipalities in Sweden that collect their drinking water from them. To ensure a sound drinking-water supply, the Geological Survey of Sweden has initiated a groundwater exploration plan. Airborne electromagnetic measurements have seen increasing use in hydrogeophysical projects and have great potential to achieve high-quality models, especially when combined with drilling data. In 2018, the Geological Survey of Sweden conducted an airborne electromagnetic survey, using the SkyTEM system, on the outskirts of Örebro, Sweden. SkyTEM is a time-domain system, developed especially with hydrogeophysical applications in mind, and is the system most favoured in hydrogeophysical investigations. It is unique in being able to measure interleaved low- and high-moment current pulses, which enables both high resolution close to the surface and an increased depth of investigation. During 2019, further drilling in the area was carried out, including both lithological logging and geophysical logging of natural gamma and normal resistivity. High natural gamma radiation typically indicates clay content in the rocks. The geology in the area has been well explored since the 1940s, when oil was extracted from alum shale in Kvarntorp, located in the survey area. Rocks of sedimentary origin reach approximately 80 m down to the contact with the bedrock. Well-preserved layers of limestone, shale, alum shale and sandstone are common throughout the area. Combining SkyTEM data with borehole data increases confidence and generates a model that better reflects the geology of the area. The modelling was performed with the AarhusInv inversion code, developed by the HydroGeophysical Group (HGG) at Aarhus University, Denmark. Four different models along a single line were generated by using 3, 4, 6 and 30 layers for the reference model in the inversion. Horizontal constraints were applied to all models; vertical constraints were applied only to the 30-layer model. The survey flight altitude is considered high, and in combination with the removal of data points affected by noise, the maximum number of layers in the final model is limited to three. This suggests that the 3-layer model is the most representative model for this survey. The conductive shale seen in the geophysical log is visible in all models at a depth of roughly 40-60 m, which is consistent with the geophysical log. No information is retrieved below the shale, which indicates that the contact between the sandstone and the crystalline rock is not resolved. The lack of information below a highly conductive structure is expected due to shielding effects. This study recommends carefully assessing the flight altitude during quality-control analysis and survey design.
APA, Harvard, Vancouver, ISO, and other citation styles
7

Beretta, Valentina. „Évaluation de la véracité des données : améliorer la découverte de la vérité en utilisant des connaissances a priori“. Thesis, IMT Mines Alès, 2018. http://www.theses.fr/2018EMAL0002/document.

Full text of the source
Abstract:
The notion of data veracity is getting increasing attention due to the problem of misinformation and fake news. With more and more information published online, it is becoming essential to develop models that automatically evaluate information veracity. Indeed, the task of evaluating data veracity is very difficult for humans: they are affected by confirmation bias, which prevents them from objectively evaluating information reliability. Moreover, the amount of information available nowadays makes this task time-consuming, and the computational power of computers is required. It is critical to develop methods that are able to automate this task. In this thesis we focus on Truth Discovery models. These approaches address the data veracity problem when conflicting values about the same properties of real-world entities are provided by multiple sources. They aim to identify the true claims among the set of conflicting ones. More precisely, they are unsupervised models based on the rationale that true information is provided by reliable sources and reliable sources provide true information. The main contribution of this thesis consists in improving Truth Discovery models by considering a priori knowledge expressed in ontologies. This knowledge may facilitate the identification of true claims. Two particular aspects of ontologies are considered. First, we explore the semantic dependencies that may exist among different values, i.e. the ordering of values through certain conceptual relationships. Indeed, two different values are not necessarily conflicting: they may represent the same concept, but with different levels of detail. In order to integrate this kind of knowledge into existing approaches, we use the mathematical models of partial orders. Then, we consider recurrent patterns that can be derived from ontologies. This additional information reinforces the confidence in certain values when certain recurrent patterns are observed. In this case, we model recurrent patterns using rules. Experiments conducted on both synthetic and real-world datasets show that a priori knowledge enhances existing models and paves the way towards a more reliable information world. The source code as well as the synthetic and real-world datasets are freely available.
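A toy sketch of the first idea above (values ordered by an ontology are not necessarily conflicting): a claimed value also counts as support for its ancestors in a hypothetical concept hierarchy. This illustrates the principle only, not the author's algorithm.
```python
from collections import Counter

# hypothetical concept hierarchy: child -> parent
PARENT = {"Paris": "France", "Lyon": "France", "France": "Europe",
          "Berlin": "Germany", "Germany": "Europe"}

def ancestors(value):
    """Return the value itself plus all of its ancestors in the hierarchy."""
    chain = [value]
    while value in PARENT:
        value = PARENT[value]
        chain.append(value)
    return chain

def supported_values(claims):
    """Count how many sources support each value, directly or via a more specific value."""
    support = Counter()
    for source, value in claims.items():
        for v in ancestors(value):
            support[v] += 1
    return support

claims = {"s1": "Paris", "s2": "Lyon", "s3": "Berlin"}
print(supported_values(claims))
# "France" is supported by 2 sources and "Europe" by 3, although no source asserted them literally
```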
APA, Harvard, Vancouver, ISO, and other citation styles
8

Denaxas, Spiridon Christoforos. „A novel framework for integrating a priori domain knowledge into traditional data analysis in the context of bioinformatics“. Thesis, University of Manchester, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.492124.

Full text of the source
Abstract:
Recent advances in experimental technology have given scientists the ability to perform large-scale multidimensional experiments involving large data sets. As a direct consequence, the amount of data being generated is rising in an exponential manner. However, in order to fully scrutinize and comprehend the results obtained from traditional data analysis approaches, it has been shown that a priori domain knowledge must be taken into consideration. Infusing existing knowledge into data analysis operations, however, is a non-trivial task which presents a number of challenges. This research is concerned with utilizing a structured ontology representing the individual elements composing such large data sets for assessing the results obtained. More specifically, statistical natural language processing and information retrieval methodologies are used in order to provide a seamless integration of existing domain knowledge in the context of cluster analysis experiments on gene product expression patterns. The aim of this research is to produce a framework for integrating a priori domain knowledge into traditional data analysis approaches. This is done in the context of DNA microarrays and gene expression experiments. The value added by the framework to the existing body of research is twofold. First, the framework provides a figure-of-merit score for assessing and quantifying the biological relatedness between individual gene products. Second, it proposes a mechanism for evaluating the results of data clustering algorithms from a biological point of view.
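One simple way to obtain a figure-of-merit score of the kind mentioned above is to compare the ontology annotations of two gene products, for example with a Jaccard index over their (hypothetical) GO term sets; the thesis itself uses statistical NLP and information-retrieval machinery, so this is only a conceptual stand-in.
```python
def jaccard_relatedness(terms_a, terms_b):
    """Figure-of-merit score: overlap of two ontology annotation sets (0 = unrelated, 1 = identical)."""
    a, b = set(terms_a), set(terms_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# hypothetical GO annotations for two gene products
gene1 = {"GO:0006915", "GO:0008219", "GO:0042981"}
gene2 = {"GO:0006915", "GO:0042981", "GO:0016020"}
print(jaccard_relatedness(gene1, gene2))  # 0.5

# a cluster can then be assessed via the mean pairwise relatedness of its members
```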
APA, Harvard, Vancouver, ISO, and other citation styles
9

Katragadda, Mohit. „Development of flame surface density closure for turbulent premixed flames based on a priori analysis of direct numerical simulation data“. Thesis, University of Newcastle upon Tyne, 2013. http://hdl.handle.net/10443/2195.

Full text of the source
Abstract:
In turbulent premixed flames, the modelling of the mean or filtered reaction rate translates to the closure of the flame surface to volume ratio, which is commonly referred to as the Flame Surface Density (FSD). The FSD-based reaction rate closure is well established in the context of Reynolds Averaged Navier-Stokes (RANS) simulations for unity Lewis numbers. However, models for the FSD in the context of Large Eddy Simulations (LES) are relatively rare. In this study, three-dimensional Direct Numerical Simulations (DNS) of freely propagating statistically planar premixed flames encompassing a range of different turbulent Reynolds numbers and global Lewis numbers were used. The variation of turbulent Reynolds number has been brought about by modifying the Karlovitz and the Damköhler numbers independently of each other. The DNS data has been explicitly Reynolds averaged and LES filtered for an a priori assessment of existing FSD models and for the purpose of proposing new models where necessary.
APA, Harvard, Vancouver, ISO, and other citation styles
10

Rouault-Pic, Sandrine. „Reconstruction en tomographie locale : introduction d'information à priori basse résolution“. PhD thesis, Université Joseph Fourier (Grenoble), 1996. http://tel.archives-ouvertes.fr/tel-00005016.

Full text of the source
Abstract:
One of the current objectives in tomography is to reduce the dose delivered to the patient. New imaging systems, integrating small high-resolution detectors or strongly collimated sources, make it possible to reduce this dose. These devices bring to the fore the problem of reconstructing an image from local information. One way to approach the local tomography problem is to introduce a priori information in order to remove the non-uniqueness of the solution. We therefore propose to complement the high-resolution local projections (coming from the systems described above) with complete low-resolution projections, obtained for example from a standard CT examination. We assume that the registration of the two data sets has already been carried out; this step is not part of our work. We first adapted classical reconstruction methods (ART, regularized conjugate gradient and filtered backprojection) to the local problem by introducing the a priori information into the reconstruction process. We then address wavelet-based reconstruction methods and also propose an adaptation to our problem. In all cases, the dual resolution also appears in the reconstructed image, with a finer resolution in the region of interest. Finally, given the high computational cost of the methods involved, we propose a parallelization of the implemented algorithms.
APA, Harvard, Vancouver, ISO, and other citation styles
11

Liczbinski, Celso Antonio. „Classificação de dados imagens em alta dimensionalidade, empregando amostras semi-rotuladas e estimadores para as probabilidades a priori“. Biblioteca Digital de Teses e Dissertações da UFRGS, 2007. http://hdl.handle.net/10183/12014.

Full text of the source
Abstract:
In natural scenes there are cases in which some of the land-cover classes involved are spectrally very similar, i.e., their first-order statistics are nearly identical. In these cases, the more traditional sensor systems such as Landsat-TM and Spot, among others, usually result in a thematic image low in accuracy. On the other hand, it is well known that high-dimensional image data allow for the separation of classes that are spectrally very similar, provided that their second-order statistics differ significantly. The classification of high-dimensional image data, however, poses new problems such as the estimation of the parameters in a parametric classifier. As the data dimensionality increases, so does the number of parameters to be estimated, particularly in the covariance matrix. In real cases, however, the number of training samples available is usually limited, therefore preventing a reliable estimation of the parameters required by the classifier. The paucity of training samples results in a low accuracy for the thematic image, which becomes more noticeable as the data dimensionality increases. This condition is known as the Hughes Phenomenon. Different approaches to mitigate the Hughes Phenomenon, investigated by many authors, have been reported in the literature. Among the possible alternatives that have been proposed, the so-called semi-labeled samples have shown some promising results in the classification of high-dimensional remote sensing image data, such as AVIRIS data. In this dissertation the approach proposed by Lemos (2003) is further investigated to increase the reliability in the estimation of the parameters required by the Gaussian Maximum Likelihood (GML) classifier. In particular, we propose a methodology to estimate the a priori probabilities P(wi) required by the GML classifier. It is expected that a more realistic estimation of the a priori probabilities will help to increase the accuracy of the thematic image produced by the GML classifier. The experiments performed in this study have shown an increase in the accuracy of the thematic image, suggesting the adequacy of the proposed methodology.
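The role of the a priori probabilities in the Gaussian Maximum Likelihood classifier can be sketched with the usual quadratic discriminant, g_i(x) = ln P(wi) - 0.5 ln det(Sigma_i) - 0.5 (x - mu_i)' Sigma_i^(-1) (x - mu_i). The code below is a minimal illustration with made-up class statistics, not the adaptive semi-labelled estimation scheme of the thesis.
```python
import numpy as np

def gml_discriminant(x, mean, cov, prior):
    """Gaussian Maximum Likelihood discriminant with class prior P(wi)."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return np.log(prior) - 0.5 * logdet - 0.5 * diff @ np.linalg.solve(cov, diff)

def classify(x, classes):
    """Assign x to the class with the largest discriminant value."""
    scores = {name: gml_discriminant(x, m, c, p) for name, (m, c, p) in classes.items()}
    return max(scores, key=scores.get)

# two spectrally similar classes: nearly identical means, different covariances and priors
classes = {
    "crop A": (np.array([0.30, 0.52]), np.array([[0.010, 0.002], [0.002, 0.008]]), 0.7),
    "crop B": (np.array([0.31, 0.53]), np.array([[0.030, -0.01], [-0.01, 0.025]]), 0.3),
}
print(classify(np.array([0.35, 0.48]), classes))
```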
APA, Harvard, Vancouver, ISO, and other citation styles
12

Brard, Caroline. „Approche Bayésienne de la survie dans les essais cliniques pour les cancers rares“. Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS474/document.

Full text of the source
Abstract:
The Bayesian approach augments the information provided by the trial itself by incorporating external information into the trial analysis. In addition, this approach allows the results to be expressed directly in terms of the probability of some treatment effect, which is more informative and interpretable than a p-value and a confidence interval. Moreover, the frequent reduction of an analysis to a binary interpretation of the results (significant versus non-significant) is particularly harmful in rare diseases. In this context, the objective of my work was to explore the feasibility, constraints and contribution of the Bayesian approach in clinical trials in rare cancers with a primary censored endpoint. A review of the literature confirmed that the implementation of Bayesian methods is still limited in the analysis of clinical trials with a censored endpoint. In the second part of our work, we developed a Bayesian design integrating historical data in the setting of a real clinical trial with a survival endpoint in a rare disease (osteosarcoma). The prior incorporated individual historical data on the control arm and aggregate historical data on the relative treatment effect. Through a large simulation study, we evaluated the operating characteristics of the proposed design and calibrated the model while exploring the issue of commensurability between historical and current data. Finally, the re-analysis of three clinical trials allowed us to illustrate the contribution of the Bayesian approach to the expression of the results, and how this approach enriches the frequentist analysis of a trial.
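A minimal sketch of how aggregate historical information on a treatment effect can enter such an analysis (a conjugate normal approximation on the log hazard ratio with hypothetical numbers, not the thesis's actual osteosarcoma model): the historical estimate acts as a prior, the current trial as the likelihood, and the result is reported as a probability of benefit rather than a p-value.
```python
import numpy as np
from scipy.stats import norm

def posterior_log_hr(prior_mean, prior_sd, est_log_hr, se_log_hr):
    """Conjugate normal update for a log hazard ratio (prior x likelihood)."""
    w_prior, w_lik = 1.0 / prior_sd**2, 1.0 / se_log_hr**2
    post_var = 1.0 / (w_prior + w_lik)
    post_mean = post_var * (w_prior * prior_mean + w_lik * est_log_hr)
    return post_mean, np.sqrt(post_var)

# hypothetical numbers: historical data suggest HR ~ 0.85, the current trial estimates HR ~ 0.75
m, s = posterior_log_hr(prior_mean=np.log(0.85), prior_sd=0.30,
                        est_log_hr=np.log(0.75), se_log_hr=0.25)
print("P(HR < 1) =", norm.cdf(0.0, loc=m, scale=s))   # probability that the treatment is beneficial
```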
APA, Harvard, Vancouver, ISO, and other citation styles
13

Perra, Silvia. „Objective Bayesian variable selection for censored data“. Doctoral thesis, Università degli Studi di Cagliari, 2013. http://hdl.handle.net/11584/266108.

Full text of the source
Abstract:
In this thesis we study the problem of selecting a set of regressors when the response variable follows a parametric model (such as Weibull or lognormal) and observations are right censored. Under a Bayesian approach, the most widely used tools are the Bayes Factors (BFs) which are, however, undefined when using improper priors. Some commonly used tools in literature, which solve the problem of indeterminacy in model selection, are the Intrinsic Bayes factor (IBF) and the Fractional Bayes factor (FBF). The two proposals are not actual Bayes factors but it can be shown that they asymptotically tend to actual BFs calculated over particular priors called intrinsic and fractional priors, respectively. Each of them depends on the size of a minimal training sample (MTS) and, in particular, the IBF also depends on the MTSs used. When working with censored data, it is not immediate to define a suitable MTS because the sample space of response variables must be fully explored when drawing MTSs, but only uncensored data are actually relevant to train the improper prior into a proper posterior. In fact, an unweighted MTS consisting only of uncensored data may produce a serious bias in model selection. In order to overcome this problem, a sequential MTS (SMTS) is used, leading to an increase in the number of possible MTSs as each one has random size. This prevents the use of the IBF for exploring large model spaces. In order to decrease the computational cost, while maintaining a behavior comparable to that of the IBF, we provide a suitable definition of the FBF that gives results similar to the ones of the IBF calculated over the SMTSs. We first define the conditional FBF on a fraction proportional to the MTS size and, then, we show that the marginal FBF (mFBF), obtained by averaging the conditional FBFs with respect to the probability distribution of the fraction, is consistent and provides also good results. Next, we recall the definition of intrinsic prior for the case of the IBF and the definition of the fractional prior for the FBF and we calculate them in the case of the exponential model for right censored data. In general, when the censoring mechanism is unknown, it is not possible to obtain these priors. Also another approach to the choice of the MTS, which consists in weighting the MTS by a suitable set of weights, is presented. In fact, we define the Kaplan-Meier minimal training sample (KMMTS) which depends on the Kaplan-Meier estimator of the survival function and which contains only suitable weighted uncensored observations. This new proposal could be useful when the censoring percentage is not very high, and it allows faster computations when the predictive distributions, calculated only over uncensored observations, can be obtained in closed-form. The new methodologies are validated by means of simulation studies and applications to real data.
APA, Harvard, Vancouver, ISO, and other citation styles
14

Gay, Antonin. „Pronostic de défaillance basé sur les données pour la prise de décision en maintenance : Exploitation du principe d'augmentation de données avec intégration de connaissances à priori pour faire face aux problématiques du small data set“. Electronic Thesis or Diss., Université de Lorraine, 2023. http://www.theses.fr/2023LORR0059.

Full text of the source
Abstract:
This CIFRE PhD is a joint project between ArcelorMittal and the CRAN laboratory, with the aim of optimizing industrial maintenance decision-making through the exploitation of the available sources of information, i.e. industrial data and knowledge, under the industrial constraints presented by the steel-making context. The current maintenance strategy on steel lines is based on regular preventive maintenance. The evolution of preventive maintenance towards a dynamic strategy is achieved through predictive maintenance. Predictive maintenance has been formalized within the Prognostics and Health Management (PHM) paradigm as a seven-step process. Among these PHM steps, this PhD's work focuses on decision-making and prognostics. The Industry 4.0 context puts emphasis on data-driven approaches, which require large amounts of data that industrial systems cannot systematically supply. The first contribution of the PhD consists in proposing an equation to link prognostics performance to the number of available training samples. This contribution allows predicting the prognostics performance that could be obtained with additional data when dealing with small datasets. The second contribution of the PhD focuses on evaluating and analyzing the performance of data augmentation when applied to prognostics on small datasets. Data augmentation leads to an improvement of prognostics performance of up to 10%. The third contribution of the PhD consists in the integration of expert knowledge into data augmentation. Statistical knowledge integration proved efficient in avoiding the performance degradation caused by data augmentation under some unfavorable conditions. Finally, the fourth contribution consists in the integration of prognostics into maintenance decision-making cost modeling and the evaluation of the impact of prognostics on maintenance decision cost. It demonstrates that (i) the implementation of predictive maintenance reduces maintenance cost by up to 18-20% and (ii) the 10% prognostics improvement can reduce maintenance cost by an additional 1%.
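A minimal sketch of noise-based data augmentation for a small run-to-failure dataset (the exact augmentation and knowledge-integration schemes of the thesis are not reproduced here): synthetic samples are generated by jittering the originals, and expert or statistical knowledge can enter through the chosen noise scale or through physically plausible bounds. All names and numbers are illustrative.
```python
import numpy as np

def augment(X, y, n_new, noise_scale=0.05, bounds=None, seed=0):
    """Create n_new synthetic (features, target) pairs by jittering existing samples.

    noise_scale is relative to each feature's standard deviation; `bounds` (min, max arrays)
    can encode expert knowledge about physically plausible feature ranges.
    """
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(X), size=n_new)          # resample original points
    noise = rng.normal(scale=noise_scale * X.std(axis=0), size=(n_new, X.shape[1]))
    X_new = X[idx] + noise
    if bounds is not None:                             # clip to expert-given ranges
        X_new = np.clip(X_new, bounds[0], bounds[1])
    return np.vstack([X, X_new]), np.concatenate([y, y[idx]])

# tiny hypothetical dataset: 8 samples, 3 degradation features, remaining-useful-life target
X = np.random.default_rng(1).random((8, 3))
y = np.linspace(100, 30, 8)
X_aug, y_aug = augment(X, y, n_new=40, bounds=(np.zeros(3), np.ones(3)))
print(X_aug.shape, y_aug.shape)   # (48, 3) (48,)
```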
APA, Harvard, Vancouver, ISO, and other citation styles
15

Ducros, Florence. „Maintien en conditions opérationnelles pour une flotte de véhicules : étude de la non stabilité des flux de rechange dans le temps“. Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS213/document.

Full text of the source
Abstract:
This thesis gathers methodological contributions to simulating the need for replacement equipment for a vehicle fleet. Systems degrade with age or use, and fail when they no longer fulfill their mission. The user needs an assurance that the system is operational during its useful life. A support contract obliges the manufacturer to remedy a failure and to keep the system in operational condition for the duration of the support (MCO) contract. The management of support contracts or of support extensions requires knowledge of the equipment lifetime and also of the conditions of use of the vehicles, which depend on the customer. The analysis of customer returns (RetEx) is then an important decision-support tool for the manufacturer. In reliability or warranty analysis, engineers must often deal with lifetime data that are non-homogeneous. Most of the time, this variability is unobserved but has to be taken into account for reliability or warranty cost analysis. A further problem is that, in reliability analysis, the data are heavily censored, which makes estimation more difficult. We propose to model the heterogeneity of lifetimes by a mixture and competing-risks model of two Weibull laws. Unfortunately, the performance of classical estimation methods (maximum likelihood via EM, Bayesian approach via MCMC) is jeopardized by the high number of parameters and the heavy censoring. To overcome the problem of heavy censoring in the estimation of the Weibull mixture parameters, we propose a Bayesian bootstrap method called Bayesian Restoration Maximization. We use an unsupervised clustering method to identify the profiles of vehicle use. Our method allows simulating the spare-parts needs of a vehicle fleet for the duration of the contract or for a contract extension.
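The two ways of combining Weibull laws mentioned above can be sketched by simulation (the Bayesian Restoration Maximization estimator itself is not shown, and the parameters are made up): a mixture draws each lifetime from one of two sub-populations, while a competing-risks model takes the minimum of two latent failure times.
```python
import numpy as np

rng = np.random.default_rng(0)

def rweibull(shape, scale, size):
    """Weibull(shape, scale) lifetimes; numpy's weibull draw has unit scale."""
    return scale * rng.weibull(shape, size)

def mixture_sample(n, p, par1, par2):
    """Mixture: each unit belongs to sub-population 1 with probability p."""
    from_first = rng.random(n) < p
    return np.where(from_first, rweibull(*par1, n), rweibull(*par2, n))

def competing_sample(n, par1, par2):
    """Competing risks: the observed lifetime is the first of two failure modes."""
    return np.minimum(rweibull(*par1, n), rweibull(*par2, n))

mix = mixture_sample(10_000, p=0.6, par1=(1.5, 800.0), par2=(3.0, 2000.0))
comp = competing_sample(10_000, par1=(1.5, 800.0), par2=(3.0, 2000.0))
print(mix.mean(), comp.mean())   # the competing-risks sample has the shorter mean lifetime
```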
APA, Harvard, Vancouver, ISO, and other citation styles
16

Kubalík, Jakub. „Mining of Textual Data from the Web for Speech Recognition“. Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2010. http://www.nusl.cz/ntk/nusl-237170.

Full text of the source
Abstract:
The initial goal of this project was to study the problems of language modelling for speech recognition and techniques for obtaining text data from the Web. The text introduces the basic techniques of speech recognition and describes in more detail language models based on statistical methods. In particular, the work deals with criteria for evaluating the quality of language models and of speech recognition systems. The text further describes models and techniques of data mining, especially information retrieval. The problems associated with obtaining data from the Web are then presented, and the Google search engine is introduced in contrast to them. Part of the project was the design and implementation of a system for obtaining text from the Web, which is described in due detail. The main goal of the work, however, was to verify whether data obtained from the Web can be of any benefit to speech recognition. The described techniques therefore attempt to find the optimal way of using data obtained from the Web to improve both the example language models and the models deployed in real recognition systems.
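A minimal sketch of the statistical language-modelling idea discussed here (add-one-smoothed bigrams and perplexity on a toy corpus; the actual work evaluates much larger models built from web-crawled text):
```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count unigrams and bigrams over a tokenised corpus (list of sentences)."""
    uni, bi = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        uni.update(tokens)
        bi.update(zip(tokens, tokens[1:]))
    return uni, bi

def perplexity(sentence, uni, bi):
    """Perplexity of a sentence under an add-one smoothed bigram model."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    V = len(uni)
    logprob = sum(math.log((bi[(a, b)] + 1) / (uni[a] + V))
                  for a, b in zip(tokens, tokens[1:]))
    return math.exp(-logprob / (len(tokens) - 1))

corpus = ["speech recognition needs text data", "web text improves language models"]
uni, bi = train_bigram(corpus)
print(perplexity("web text data", uni, bi))
# adding in-domain web text to the training corpus lowers perplexity on such sentences
```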
APA, Harvard, Vancouver, ISO, and other citation styles
17

Darnieder, William Francis. „Bayesian Methods for Data-Dependent Priors“. The Ohio State University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=osu1306344172.

Full text of the source
APA, Harvard, Vancouver, ISO, and other citation styles
18

Walter, Gero. „Generalized Bayesian inference under prior-data conflict“. Diss., Ludwig-Maximilians-Universität München, 2013. http://nbn-resolving.de/urn:nbn:de:bvb:19-170598.

Full text of the source
Abstract:
This thesis is concerned with the generalisation of Bayesian inference towards the use of imprecise or interval probability, with a focus on model behaviour in case of prior-data conflict. Bayesian inference is one of the main approaches to statistical inference. It requires to express (subjective) knowledge on the parameter(s) of interest not incorporated in the data by a so-called prior distribution. All inferences are then based on the so-called posterior distribution, the subsumption of prior knowledge and the information in the data calculated via Bayes' Rule. The adequate choice of priors has always been an intensive matter of debate in the Bayesian literature. While a considerable part of the literature is concerned with so-called non-informative priors aiming to eliminate (or, at least, to standardise) the influence of priors on posterior inferences, inclusion of specific prior information into the model may be necessary if data are scarce, or do not contain much information about the parameter(s) of interest; also, shrinkage estimators, common in frequentist approaches, can be considered as Bayesian estimators based on informative priors. When substantial information is used to elicit the prior distribution through, e.g, an expert's assessment, and the sample size is not large enough to eliminate the influence of the prior, prior-data conflict can occur, i.e., information from outlier-free data suggests parameter values which are surprising from the viewpoint of prior information, and it may not be clear whether the prior specifications or the integrity of the data collecting method (the measurement procedure could, e.g., be systematically biased) should be questioned. In any case, such a conflict should be reflected in the posterior, leading to very cautious inferences, and most statisticians would thus expect to observe, e.g., wider credibility intervals for parameters in case of prior-data conflict. However, at least when modelling is based on conjugate priors, prior-data conflict is in most cases completely averaged out, giving a false certainty in posterior inferences. Here, imprecise or interval probability methods offer sound strategies to counter this issue, by mapping parameter uncertainty over sets of priors resp. posteriors instead of over single distributions. This approach is supported by recent research in economics, risk analysis and artificial intelligence, corroborating the multi-dimensional nature of uncertainty and concluding that standard probability theory as founded on Kolmogorov's or de Finetti's framework may be too restrictive, being appropriate only for describing one dimension, namely ideal stochastic phenomena. The thesis studies how to efficiently describe sets of priors in the setting of samples from an exponential family. Models are developed that offer enough flexibility to express a wide range of (partial) prior information, give reasonably cautious inferences in case of prior-data conflict while resulting in more precise inferences when prior and data agree well, and still remain easily tractable in order to be useful for statistical practice. Applications in various areas, e.g. common-cause failure modeling and Bayesian linear regression, are explored, and the developed approach is compared to other imprecise probability models.
Das Thema dieser Dissertation ist die Generalisierung der Bayes-Inferenz durch die Verwendung von unscharfen oder intervallwertigen Wahrscheinlichkeiten. Ein besonderer Fokus liegt dabei auf dem Modellverhalten in dem Fall, dass Vorwissen und beobachtete Daten in Konflikt stehen. Die Bayes-Inferenz ist einer der Hauptansätze zur Herleitung von statistischen Inferenzmethoden. In diesem Ansatz muss (eventuell subjektives) Vorwissen über die Modellparameter in einer sogenannten Priori-Verteilung (kurz: Priori) erfasst werden. Alle Inferenzaussagen basieren dann auf der sogenannten Posteriori-Verteilung (kurz: Posteriori), welche mittels des Satzes von Bayes berechnet wird und das Vorwissen und die Informationen in den Daten zusammenfasst. Wie eine Priori-Verteilung in der Praxis zu wählen sei, ist dabei stark umstritten. Ein großer Teil der Literatur befasst sich mit der Bestimmung von sogenannten nichtinformativen Prioris. Diese zielen darauf ab, den Einfluss der Priori auf die Posteriori zu eliminieren oder zumindest zu standardisieren. Falls jedoch nur wenige Daten zur Verfügung stehen, oder diese nur wenige Informationen in Bezug auf die Modellparameter bereitstellen, kann es hingegen nötig sein, spezifische Priori-Informationen in ein Modell einzubeziehen. Außerdem können sogenannte Shrinkage-Schätzer, die in frequentistischen Ansätzen häufig zum Einsatz kommen, als Bayes-Schätzer mit informativen Prioris angesehen werden. Wenn spezifisches Vorwissen zur Bestimmung einer Priori genutzt wird (beispielsweise durch eine Befragung eines Experten), aber die Stichprobengröße nicht ausreicht, um eine solche informative Priori zu überstimmen, kann sich ein Konflikt zwischen Priori und Daten ergeben. Dieser kann sich darin äußern, dass die beobachtete (und von eventuellen Ausreißern bereinigte) Stichprobe Parameterwerte impliziert, die aus Sicht der Priori äußerst überraschend und unerwartet sind. In solch einem Fall kann es unklar sein, ob eher das Vorwissen oder eher die Validität der Datenerhebung in Zweifel gezogen werden sollen. (Es könnten beispielsweise Messfehler, Kodierfehler oder eine Stichprobenverzerrung durch selection bias vorliegen.) Zweifellos sollte sich ein solcher Konflikt in der Posteriori widerspiegeln und eher vorsichtige Inferenzaussagen nach sich ziehen; die meisten Statistiker würden daher davon ausgehen, dass sich in solchen Fällen breitere Posteriori-Kredibilitätsintervalle für die Modellparameter ergeben. Bei Modellen, die auf der Wahl einer bestimmten parametrischen Form der Priori basieren, welche die Berechnung der Posteriori wesentlich vereinfachen (sogenannte konjugierte Priori-Verteilungen), wird ein solcher Konflikt jedoch einfach ausgemittelt. Dann werden Inferenzaussagen, die auf einer solchen Posteriori basieren, den Anwender in falscher Sicherheit wiegen. In dieser problematischen Situation können Intervallwahrscheinlichkeits-Methoden einen fundierten Ausweg bieten, indem Unsicherheit über die Modellparameter mittels Mengen von Prioris beziehungsweise Posterioris ausgedrückt wird. Neuere Erkenntnisse aus Risikoforschung, Ökonometrie und der Forschung zu künstlicher Intelligenz, die die Existenz von verschiedenen Arten von Unsicherheit nahelegen, unterstützen einen solchen Modellansatz, der auf der Feststellung aufbaut, dass die auf den Ansätzen von Kolmogorov oder de Finetti basierende übliche Wahrscheinlichkeitsrechung zu restriktiv ist, um diesen mehrdimensionalen Charakter von Unsicherheit adäquat einzubeziehen. 
Tatsächlich kann in diesen Ansätzen nur eine der Dimensionen von Unsicherheit modelliert werden, nämlich die der idealen Stochastizität. In der vorgelegten Dissertation wird untersucht, wie sich Mengen von Prioris für Stichproben aus Exponentialfamilien effizient beschreiben lassen. Wir entwickeln Modelle, die eine ausreichende Flexibilität gewährleisten, sodass eine Vielfalt von Ausprägungen von partiellem Vorwissen beschrieben werden kann. Diese Modelle führen zu vorsichtigen Inferenzaussagen, wenn ein Konflikt zwischen Priori und Daten besteht, und ermöglichen dennoch präzisere Aussagen für den Fall, dass Priori und Daten im Wesentlichen übereinstimmen, ohne dabei die Einsatzmöglichkeiten in der statistischen Praxis durch eine zu hohe Komplexität in der Anwendung zu erschweren. Wir ermitteln die allgemeinen Inferenzeigenschaften dieser Modelle, die sich durch einen klaren und nachvollziehbaren Zusammenhang zwischen Modellunsicherheit und der Präzision von Inferenzaussagen auszeichnen, und untersuchen Anwendungen in verschiedenen Bereichen, unter anderem in sogenannten common-cause-failure-Modellen und in der linearen Bayes-Regression. Zudem werden die in dieser Dissertation entwickelten Modelle mit anderen Intervallwahrscheinlichkeits-Modellen verglichen und deren jeweiligen Stärken und Schwächen diskutiert, insbesondere in Bezug auf die Präzision von Inferenzaussagen bei einem Konflikt von Vorwissen und beobachteten Daten.
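For illustration, here is a minimal Beta-Binomial sketch (with made-up numbers, not the models developed in the thesis) of the behaviour described above: a single conjugate prior yields posterior intervals that hardly react to prior-data conflict, whereas a set of priors, obtained here by letting the prior strength n0 vary over an interval, produces a wide envelope of posterior means under conflict and a tight one under agreement.

```python
from scipy.stats import beta

def posterior(y0, n0, k, n):
    """Update a conjugate Beta prior with mean y0 and strength n0 by k successes in n trials."""
    return beta(y0 * n0 + k, (1 - y0) * n0 + n - k)

y0, n = 0.75, 20                                        # prior mean and sample size
for label, k in [("agreement", 15), ("conflict", 2)]:   # observed data mean 0.75 vs. 0.10
    single = posterior(y0, 4, k, n)                     # one conjugate prior with strength n0 = 4
    print(label, "single prior 95% interval:", [round(q, 2) for q in single.interval(0.95)])
    means = [posterior(y0, n0, k, n).mean() for n0 in (1, 4, 10)]   # set of priors: n0 varies
    print(label, "posterior means over the set:", round(min(means), 2), "-", round(max(means), 2))
```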
APA, Harvard, Vancouver, ISO und andere Zitierweisen
19

Brouwer, Thomas Alexander. „Bayesian matrix factorisation : inference, priors, and data integration“. Thesis, University of Cambridge, 2017. https://www.repository.cam.ac.uk/handle/1810/269921.

Der volle Inhalt der Quelle
Annotation:
In recent years the amount of biological data has increased exponentially. Most of these data can be represented as matrices relating two different entity types, such as drug-target interactions (relating drugs to protein targets), gene expression profiles (relating drugs or cell lines to genes), and drug sensitivity values (relating drugs to cell lines). Not only is the size of these datasets increasing, but so is the number of different entity types that they relate. Furthermore, not all values in these datasets are typically observed, and some datasets are very sparse. Matrix factorisation is a popular group of methods that can be used to analyse these matrices. The idea is that each matrix can be decomposed into two or more smaller matrices, such that their product approximates the original one. This factorisation of the data reveals patterns in the matrix, and gives us a lower-dimensional representation. Not only can we use this technique to identify clusters and other biological signals, we can also predict the unobserved entries, allowing us to prune biological experiments. In this thesis we introduce and explore several Bayesian matrix factorisation models, focusing on how to best use them for predicting these missing values in biological datasets. Our main hypothesis is that matrix factorisation methods, and in particular Bayesian variants, are an extremely powerful paradigm for predicting values in biological datasets, as well as other applications, and especially for sparse and noisy data. We demonstrate the competitiveness of these approaches compared to other state-of-the-art methods, and explore the conditions under which they perform the best. We consider several aspects of the Bayesian approach to matrix factorisation. Firstly, we study the effect of the inference approaches used to find the factorisation on predictive performance. Secondly, we identify different likelihood and Bayesian prior choices that we can use for these models, and explore when they are most appropriate. Finally, we introduce a Bayesian matrix factorisation model that can be used to integrate multiple biological datasets, and hence improve predictions. This model combines different matrix factorisation models and Bayesian priors in a hybrid fashion. Through these models and experiments we support our hypothesis and provide novel insights into the best ways to use Bayesian matrix factorisation methods for predictive purposes.
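As a rough illustration of the general idea (not the models of the thesis), the sketch below fits a MAP matrix factorisation with Gaussian priors on the factors, using alternating least squares on the observed entries only, and then predicts the held-out entries of a synthetic low-rank matrix.

```python
import numpy as np

def map_factorise(R, mask, k=3, lam=0.1, iters=50, seed=0):
    """MAP matrix factorisation (Gaussian likelihood, Gaussian priors) via alternating least squares."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U, V = rng.standard_normal((n, k)), rng.standard_normal((m, k))
    I = lam * np.eye(k)
    for _ in range(iters):
        for i in range(n):                         # update row factors given column factors
            obs = mask[i]
            U[i] = np.linalg.solve(V[obs].T @ V[obs] + I, V[obs].T @ R[i, obs])
        for j in range(m):                         # update column factors given row factors
            obs = mask[:, j]
            V[j] = np.linalg.solve(U[obs].T @ U[obs] + I, U[obs].T @ R[obs, j])
    return U @ V.T                                 # predictions for observed and missing entries

rng = np.random.default_rng(1)
A, B = rng.standard_normal((30, 3)), rng.standard_normal((20, 3))
R, mask = A @ B.T, rng.random((30, 20)) > 0.3      # hide roughly 30% of the entries
R_hat = map_factorise(R * mask, mask)
print(round(float(np.sqrt(np.mean((R_hat - R)[~mask] ** 2))), 3))   # RMSE on the held-out entries
```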
APA, Harvard, Vancouver, ISO und andere Zitierweisen
20

Kaushik, Rituraj. „Data-Efficient Robot Learning using Priors from Simulators“. Electronic Thesis or Diss., Université de Lorraine, 2020. http://www.theses.fr/2020LORR0105.

Der volle Inhalt der Quelle
Annotation:
Quand les robots doivent affronter le monde réel, ils doivent s'adapter à diverses situations imprévues en acquérant de nouvelles compétences le plus rapidement possible. Les algorithmes d'apprentissage par renforcement (par exemple, l'apprentissage par renforcement profond) pourraient permettre d’apprendre de telles compétences, mais les algorithmes actuels nécessitent un temps d'interaction trop important. Dans cette thèse, nous avons exploré des méthodes permettant à un robot d'acquérir de nouvelles compétences par essai-erreur en quelques minutes d'interaction physique. Notre objectif principal est de combiner des connaissances acquises sur un simulateur avec les expériences réelles du robot afin d'obtenir un apprentissage et une adaptation rapides. Dans notre première contribution, nous proposons un nouvel algorithme de recherche de politiques basé sur un modèle, appelé Multi-DEX, qui (1) est capable de trouver des politiques dans des scénarios aux récompenses rares, (2) n'impose aucune contrainte sur le type de politique ou le type de fonction de récompense et (3) est aussi efficace en termes de données que l'algorithme de recherche de politiques de l’état de l’art dans des scénarios de récompenses non rares. Dans notre deuxième contribution, nous proposons un algorithme d'apprentissage en ligne basé sur un répertoire, appelé APROL, qui permet à un robot de s'adapter rapidement à des dommages physiques (par exemple, une patte endommagée) ou à des perturbations environnementales (par exemple, les conditions du terrain) et de résoudre la tâche donnée. Nous montrons qu'APROL surpasse plusieurs lignes de base, y compris l'algorithme d'apprentissage par répertoire RTE (Reset Free Trial and Error), en résolvant les tâches en un temps d'interaction beaucoup plus court que les algorithmes avec lesquels nous l’avons comparé. Dans notre troisième contribution, nous présentons un algorithme de méta-apprentissage basé sur les gradients appelé FAMLE. FAMLE permet d'entraîner le modèle dynamique du robot à partir de données simulées afin que le modèle puisse être adapté rapidement à diverses situations invisibles grâce aux observations du monde réel. En utilisant FAMLE pour améliorer un modèle pour la commande prédictive, nous montrons que notre approche surpasse plusieurs algorithmes d'apprentissage basés ou non sur un modèle, et résout les tâches données en moins de temps d'interaction que les algorithmes avec lesquels nous l’avons comparé
As soon as robots step out into the real and uncertain world, they have to adapt to various unanticipated situations by acquiring new skills as quickly as possible. Unfortunately, on robots, current state-of-the-art reinforcement learning algorithms (e.g., deep reinforcement learning) require a large amount of interaction time to train a new skill. In this thesis, we have explored methods to allow a robot to acquire new skills through trial-and-error within a few minutes of physical interaction. Our primary focus is to incorporate prior knowledge from a simulator with real-world experiences of a robot to achieve rapid learning and adaptation. In our first contribution, we propose a novel model-based policy search algorithm called Multi-DEX that (1) is capable of finding policies in sparse reward scenarios, (2) does not impose any constraints on the type of policy or the type of reward function, and (3) is as data-efficient as state-of-the-art model-based policy search algorithms in non-sparse reward scenarios. In our second contribution, we propose a repertoire-based online learning algorithm called APROL which allows a robot to adapt quickly to physical damage (e.g., a damaged leg) or environmental perturbations (e.g., terrain conditions) and solve the given task. In this work, we use several repertoires of policies generated in simulation for a subset of possible situations that the robot might face in the real world. During online learning, the robot automatically figures out the most suitable repertoire with which to adapt and control itself. We show that APROL outperforms several baselines, including the current state-of-the-art repertoire-based learning algorithm RTE, by solving the tasks in much less interaction time than the baselines. In our third contribution, we introduce a gradient-based meta-learning algorithm called FAMLE. FAMLE meta-trains the dynamical model of the robot from simulated data so that the model can be adapted quickly to various unseen situations using real-world observations. By using FAMLE with a model-predictive control framework, we show that our approach outperforms several model-based and model-free learning algorithms, and solves the given tasks in less interaction time than the baselines.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
21

Fu, Shuai. „Inverse problems occurring in uncertainty analysis“. Thesis, Paris 11, 2012. http://www.theses.fr/2012PA112208/document.

Der volle Inhalt der Quelle
Annotation:
Ce travail de recherche propose une solution aux problèmes inverses probabilistes avec des outils de la statistique bayésienne. Le problème inverse considéré est d'estimer la distribution d'une variable aléatoire non observée X à partir d'observations bruitées Y suivant un modèle physique coûteux H. En général, de tels problèmes inverses sont rencontrés dans le traitement des incertitudes. Le cadre bayésien nous permet de prendre en compte les connaissances préalables d'experts en particulier lorsque peu de données sont disponibles. Un algorithme de Metropolis-Hastings-within-Gibbs est proposé pour approcher la distribution a posteriori des paramètres de X avec un processus d'augmentation des données. A cause d'un nombre élevé d'appels, la fonction coûteuse H est remplacée par un émulateur de krigeage (métamodèle). Cette approche implique plusieurs erreurs de natures différentes et, dans ce travail,nous nous attachons à estimer et réduire l'impact de ces erreurs. Le critère DAC a été proposé pour évaluer la pertinence du plan d'expérience (design) et le choix de la loi apriori, en tenant compte des observations. Une autre contribution est la construction du design adaptatif adapté à notre objectif particulier dans le cadre bayésien. La méthodologie principale présentée dans ce travail a été appliquée à un cas d'étude en ingénierie hydraulique
This thesis provides a probabilistic solution to inverse problems through Bayesian techniques. The inverse problem considered here is to estimate the distribution of a non-observed random variable X from some noisy observed data Y explained by a time-consuming physical model H. In general, such inverse problems are encountered when treating uncertainty in industrial applications. Bayesian inference is favored as it accounts for prior expert knowledge on X in a small sample size setting. A Metropolis-Hastings-within-Gibbs algorithm is proposed to compute the posterior distribution of the parameters of X through a data augmentation process. Since it requires a high number of calls to the expensive function H, the model is replaced by a kriging meta-model. This approach involves several errors of different natures and we focus on measuring and reducing the possible impact of those errors. A DAC criterion has been proposed to assess the relevance of the numerical design of experiments and the prior assumption, taking into account the observed data. Another contribution is the construction of adaptive designs of experiments adapted to our particular purpose in the Bayesian framework. The main methodology presented in this thesis has been applied to a real hydraulic engineering case-study.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
22

Xue, Xinyu. „Data preservation in intermittently connected sensor network with data priority“. Thesis, Wichita State University, 2013. http://hdl.handle.net/10057/6848.

Der volle Inhalt der Quelle
Annotation:
In intermittently connected sensor networks, the data generated may have different importance and priority. Different types of data help scientists analyze the physical environment in different ways. In a challenging environment, sensor nodes are not always connected to the base station by a communication path, and not all data may be preserved in the network. Under such circumstances, given the severe energy constraints and storage limits of the sensor nodes, how to preserve data with the maximum priority is a new and challenging problem. In this thesis, we study how to preserve data so as to maximize the total preserved priority under the constraints of the limited energy level and storage capacity of each sensor node. We design an efficient algorithm and prove that it is optimal. The core of the problem is the maximum weighted flow problem, which maximizes the total weight of the flow in a network in which different flows have different weights. Maximum weighted flow is a generalization of the classical maximum flow problem, in which each unit of flow has the same weight. To the best of our knowledge, our work is the first to study and solve the maximum weighted flow problem. We also propose a more efficient heuristic algorithm. Through simulation, we show that it performs comparably to the optimal algorithm and better than the classical maximum flow algorithm, which does not consider data priority. Finally, we design a distributed data preservation algorithm based on the push-relabel algorithm and analyze its time and message complexity; experiments show that it is superior to the distributed push-relabel maximum flow algorithm in terms of total preserved priority.
Thesis (M.S.)--Wichita State University, College of Engineering, Dept. of Electrical Engineering and Computer Science
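To make the weighted-flow idea concrete, the following sketch (a hypothetical toy network, not the algorithm of the thesis) casts priority-aware data preservation as a minimum-cost flow: preserving a unit of priority p earns weight p (cost -p), and a zero-cost bypass edge allows low-priority data to be dropped when link or storage capacity runs out.

```python
import networkx as nx

G = nx.DiGraph()
data = {"g1": (4, 5), "g2": (3, 2)}        # generator -> (units of data, priority per unit)
total = sum(units for units, _ in data.values())
G.add_node("S", demand=-total)
G.add_node("T", demand=total)
for g, (units, prio) in data.items():
    G.add_edge("S", g, capacity=units, weight=-prio)   # reward for preserving this data
G.add_edge("g1", "s1", capacity=3, weight=0)           # communication links (energy limits)
G.add_edge("g2", "s1", capacity=3, weight=0)
G.add_edge("s1", "T", capacity=5, weight=0)            # storage node with 5 units of space
G.add_edge("S", "T", capacity=total, weight=0)         # bypass: data that is not preserved

flow = nx.min_cost_flow(G)
preserved = {g: flow["S"][g] for g in data}
print(preserved)     # g1 preserves 3 of its 4 units, g2 preserves 2 of its 3
```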
APA, Harvard, Vancouver, ISO und andere Zitierweisen
23

Duan, Yuyan. „A Modified Bayesian Power Prior Approach with Applications in Water Quality Evaluation“. Diss., Virginia Tech, 2005. http://hdl.handle.net/10919/29976.

Der volle Inhalt der Quelle
Annotation:
This research is motivated by an issue frequently encountered in environmental water quality evaluation. Many times, the sample size of water monitoring data is too small to have adequate power. Here, we present a Bayesian power prior approach by incorporating the current data and historical data and/or the data collected at neighboring stations to make stronger statistical inferences on the parameters of interest. The elicitation of power prior distributions is based on the availability of historical data, and is realized by raising the likelihood function of the historical data to a fractional power. The power prior Bayesian analysis has been proven to be a useful class of informative priors in Bayesian inference. In this dissertation, we propose a modified approach to constructing the joint power prior distribution for the parameter of interest and the power parameter. The power parameter, in this modified approach, quantifies the heterogeneity between current and historical data automatically, and hence controls the influence of historical data on the current study in a sensible way. In addition, the modified power prior needs little to ensure its propriety. The properties of the modified power prior and its posterior distribution are examined for the Bernoulli and normal populations. The modified and the original power prior approaches are compared empirically in terms of the mean squared error (MSE) of parameter estimates as well as the behavior of the power parameter. Furthermore, the extension of the modified power prior to multiple historical data sets is discussed, followed by its comparison with the random effects model. Several sets of water quality data are studied in this dissertation to illustrate the implementation of the modified power prior approach with normal and Bernoulli models. Since the power prior method uses information from sources other than current data, it has advantages in terms of power and estimation precision for decisions with small sample sizes, relative to methods that ignore prior information.
Ph. D.
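The basic power-prior mechanism can be sketched for Bernoulli data with a Beta(1, 1) initial prior (hypothetical counts; the modified joint prior for the power parameter proposed in the dissertation is not reproduced here): the historical likelihood is raised to a power a0 in [0, 1] before being combined with the current data.

```python
from scipy.stats import beta

def power_prior_posterior(k_hist, n_hist, k_cur, n_cur, a0):
    """Posterior of a Bernoulli rate when the historical likelihood is discounted by a0."""
    a = 1 + a0 * k_hist + k_cur
    b = 1 + a0 * (n_hist - k_hist) + (n_cur - k_cur)
    return beta(a, b)

# Historical monitoring data: 40 exceedances in 200 samples; current data: 3 in 20.
for a0 in (0.0, 0.5, 1.0):
    post = power_prior_posterior(40, 200, 3, 20, a0)
    print(a0, round(post.mean(), 3), tuple(round(q, 3) for q in post.interval(0.95)))
```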
APA, Harvard, Vancouver, ISO und andere Zitierweisen
24

Khalili, K. „Enhancing vision data using prior knowledge for assembly applications“. Thesis, University of Salford, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.360432.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
25

Razzaq, Misbah. „Integrating phosphoproteomic time series data into prior knowledge networks“. Thesis, Ecole centrale de Nantes, 2018. http://www.theses.fr/2018ECDN0048/document.

Der volle Inhalt der Quelle
Annotation:
Les voies de signalisation canoniques traditionnelles aident à comprendre l'ensemble des processus de signalisation à l'intérieur de la cellule. Les données phosphoprotéomiques à grande échelle donnent un aperçu des altérations entre différentes protéines dans différents contextes expérimentaux. Notre objectif est de combiner les réseaux de signalisation traditionnels avec des données de séries temporelles phosphoprotéomiques complexes afin de démêler les réseaux de signalisation spécifiques aux cellules. Côté application, nous appliquons et améliorons une méthode de séries temporelles caspo conçue pour intégrer des données phosphoprotéomiques de séries temporelles dans des réseaux de signalisation de protéines. Nous utilisons une étude de cas réel à grande échelle tirée du défi HPN-DREAM BreastCancer. Nous déduisons une famille de modèles booléens à partir de données de séries temporelles de perturbations multiples de quatre lignées cellulaires de cancer du sein, compte tenu d'un réseau de signalisation protéique antérieur. Les résultats obtenus sont comparables aux équipes les plus performantes du challenge HPN-DREAM. Nous avons découvert que les modèles similaires sont regroupés dans l'espace de solutions. Du côté informatique, nous avons amélioré la méthode pour découvrir diverses solutions et améliorer le temps de calcul
Traditional canonical signaling pathways help to understand overall signaling processes inside the cell. Large scale phosphoproteomic data provide insight into alterations among different proteins under different experimental settings. Our goal is to combine the traditional signaling networks with complex phosphoproteomic time-series data in order to unravel cell specific signaling networks. On the application side, we apply and improve a caspo time series method conceived to integrate time series phosphoproteomic data into protein signaling networks. We use a large-scale real case study from the HPN-DREAM BreastCancer challenge. We infer a family of Boolean models from multiple perturbation time series data of four breast cancer cell lines given a prior protein signaling network. The obtained results are comparable to the top performing teams of the HPN-DREAM challenge. We also discovered that the similar models are clustered to getherin the solutions space. On the computational side, we improved the method to discover diverse solutions and improve the computational time
APA, Harvard, Vancouver, ISO und andere Zitierweisen
26

Lindlöf, Angelica. „Deriving Genetic Networks from Gene Expression Data and Prior Knowledge“. Thesis, University of Skövde, Department of Computer Science, 2001. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-589.

Der volle Inhalt der Quelle
Annotation:

In this work three different approaches for deriving genetic association networks were tested. The three approaches were Pearson correlation, an algorithm based on the Boolean network approach and prior knowledge. Pearson correlation and the algorithm based on the Boolean network approach derived associations from gene expression data. In the third approach, prior knowledge from a known genetic network of a related organism was used to derive associations for the target organism, by using homolog matching and mapping the known genetic network to the related organism. The results indicate that the Pearson correlation approach gave the best results, but the prior knowledge approach seems to be the one most worth pursuing.
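A minimal sketch of the first approach (synthetic expression values and an arbitrary cut-off, purely for illustration): genes whose expression profiles have a high absolute Pearson correlation are connected in the association network.

```python
import numpy as np

rng = np.random.default_rng(0)
expr = rng.standard_normal((6, 10))                 # 6 genes x 10 conditions
expr[1] = expr[0] + 0.1 * rng.standard_normal(10)   # make gene 1 track gene 0

corr = np.corrcoef(expr)                            # gene-by-gene Pearson correlations
threshold = 0.8
edges = [(i, j) for i in range(len(corr)) for j in range(i + 1, len(corr))
         if abs(corr[i, j]) >= threshold]
print(edges)                                        # expected to contain the pair (0, 1)
```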

APA, Harvard, Vancouver, ISO und andere Zitierweisen
27

Hira, Zena Maria. „Dimensionality reduction methods for microarray cancer data using prior knowledge“. Thesis, Imperial College London, 2016. http://hdl.handle.net/10044/1/33812.

Der volle Inhalt der Quelle
Annotation:
Microarray studies are currently a very popular source of biological information. They allow the simultaneous measurement of hundreds of thousands of genes, drastically increasing the amount of data that can be gathered in a small amount of time and also decreasing the cost of producing such results. Large numbers of high dimensional data sets are currently being generated and there is an ongoing need to find ways to analyse them to obtain meaningful interpretations. Many microarray experiments are concerned with answering specific biological or medical questions regarding diseases and treatments. Cancer is one of the most popular research areas and there is a plethora of data available requiring in depth analysis. Although the analysis of microarray data has been thoroughly researched over the past ten years, new approaches still appear regularly, and may lead to a better understanding of the available information. The size of the modern data sets presents considerable difficulties to traditional methodologies based on hypothesis testing, and there is a new move towards the use of machine learning in microarray data analysis. Two new methods of using prior genetic knowledge in machine learning algorithms have been developed and their results are compared with existing methods. The prior knowledge consists of biological pathway data that can be found in on-line databases, and gene ontology terms. The first method, called ''a priori manifold learning'' uses the prior knowledge when constructing a manifold for non-linear feature extraction. It was found to perform better than both linear principal components analysis (PCA) and the non-linear Isomap algorithm (without prior knowledge) in both classification accuracy and quality of the clusters. Both pathway and GO terms were used as prior knowledge, and results showed that using GO terms can make the models over-fit the data. In the cases where the use of GO terms does not over-fit, the results are better than PCA, Isomap and a priori manifold learning using pathways. The second method, called ''the feature selection over pathway segmentation algorithm'', uses the pathway information to split a big dataset into smaller ones. Then, using AdaBoost, decision trees are constructed for each of the smaller sets and the sets that achieve higher classification accuracy are identified. The individual genes in these subsets are assessed to determine their role in the classification process. Using data sets concerning chronic myeloid leukaemia (CML) two subsets based on pathways were found to be strongly associated with the response to treatment. Using a different data set from measurements on lower grade glioma (LGG) tumours, four informative gene sets were discovered. Further analysis based on the Gini importance measure identified a set of genes for each cancer type (CML, LGG) that could predict the response to treatment very accurately (> 90%). Moreover a single gene that can predict the response to CML treatment accurately was identified.
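The second method can be sketched roughly as follows (synthetic data and hypothetical pathway groupings; not the thesis code): the gene set is split by pathway, an AdaBoost ensemble of decision trees is fitted to each subset, and the pathways whose classifiers generalise best are retained for closer inspection.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.standard_normal((80, 30))                    # 80 samples x 30 genes
y = (X[:, 0] + X[:, 1] > 0).astype(int)              # response driven by genes 0 and 1
pathways = {"pathway_A": [0, 1, 2], "pathway_B": [10, 11, 12], "pathway_C": [20, 21, 22]}

scores = {}
for name, genes in pathways.items():
    clf = AdaBoostClassifier(n_estimators=50, random_state=0)
    scores[name] = cross_val_score(clf, X[:, genes], y, cv=5).mean()
print(sorted(scores.items(), key=lambda kv: -kv[1]))  # pathway_A should rank first
```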
APA, Harvard, Vancouver, ISO und andere Zitierweisen
28

Patil, Vivek. „Criteria for Data Consistency Evaluation Prior to Modal Parameter Estimation“. University of Cincinnati / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1627667589352536.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
29

Porter, Erica May. „Applying an Intrinsic Conditional Autoregressive Reference Prior for Areal Data“. Thesis, Virginia Tech, 2019. http://hdl.handle.net/10919/91385.

Der volle Inhalt der Quelle
Annotation:
Bayesian hierarchical models are useful for modeling spatial data because they have flexibility to accommodate complicated dependencies that are common to spatial data. In particular, intrinsic conditional autoregressive (ICAR) models are commonly assigned as priors for spatial random effects in hierarchical models for areal data corresponding to spatial partitions of a region. However, selection of prior distributions for these spatial parameters presents a challenge to researchers. We present and describe ref.ICAR, an R package that implements an objective Bayes intrinsic conditional autoregressive prior on a vector of spatial random effects. This model provides an objective Bayesian approach for modeling spatially correlated areal data. ref.ICAR enables analysis of spatial areal data for a specified region, given user-provided data and information about the structure of the study region. The ref.ICAR package performs Markov Chain Monte Carlo (MCMC) sampling and outputs posterior medians, intervals, and trace plots for fixed effect and spatial parameters. Finally, the functions provide regional summaries, including medians and credible intervals for fitted values by subregion.
Master of Science
Spatial data is increasingly relevant in a wide variety of research areas. Economists, medical researchers, ecologists, and policymakers all make critical decisions about populations using data that naturally display spatial dependence. One such data type is areal data; data collected at county, habitat, or tract levels are often spatially related. Most convenient software platforms provide analyses for independent data, as the introduction of spatial dependence increases the complexity of corresponding models and computation. Use of analyses with an independent data assumption can lead researchers and policymakers to make incorrect, simplistic decisions. Bayesian hierarchical models can be used to effectively model areal data because they have flexibility to accommodate complicated dependencies that are common to spatial data. However, use of hierarchical models increases the number of model parameters and requires specification of prior distributions. We present and describe ref.ICAR, an R package available to researchers that automatically implements an objective Bayesian analysis that is appropriate for areal data.
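The ICAR structure underlying the package can be sketched in a few lines (plain numpy on a hypothetical 2x2 grid of regions; this is not the ref.ICAR implementation): the prior precision matrix Q = tau * (D - W) penalises squared differences between neighbouring areal units.

```python
import numpy as np

W = np.array([[0, 1, 1, 0],      # adjacency of a 2x2 grid of areal units
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]])
D = np.diag(W.sum(axis=1))
tau = 2.0
Q = tau * (D - W)                # singular: the ICAR prior is improper

phi = np.array([0.5, -0.2, 0.1, -0.4])       # spatial random effects (sum-to-zero in practice)
log_density_up_to_const = -0.5 * phi @ Q @ phi
pairwise = -0.5 * tau * sum(W[i, j] * (phi[i] - phi[j]) ** 2
                            for i in range(4) for j in range(i + 1, 4))
print(np.isclose(log_density_up_to_const, pairwise))   # the two forms agree
```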
APA, Harvard, Vancouver, ISO und andere Zitierweisen
30

Bogdan, Abigail Marie. „Student Reasoning from Data Tables: Data Interpretation in Light of Student Ability and Prior Belief“. The Ohio State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=osu1460120122.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
31

Hotti, Alexandra. „Bayesian insurance pricing using informative prior estimation techniques“. Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286312.

Der volle Inhalt der Quelle
Annotation:
Large, well-established insurance companies build statistical pricing models based on customer claim data. Due to their long experience and large amounts of data, they can predict their future expected claim losses accurately. In contrast, small newly formed insurance start-ups do not have access to such data. Instead, a start-up’s pricing model’s initial parameters can be set by directly estimating the risk premium tariff’s parameters in a non-statistical manner. However, this approach results in a pricing model that cannot be adjusted based on new claim data through classical frequentist insurance approaches. This thesis has put forth three Bayesian approaches for including estimates of an existing multiplicative tariff as the expectation of a prior in a Generalized Linear Model (GLM). The similarity between premiums set using the prior estimations and the static pricing model was measured as their relative difference. The results showed that the static tariff could be closely estimated. The estimated priors were then merged with claim data through the likelihood. These posteriors were estimated via the two Markov Chain Monte Carlo approaches, Metropolis and Metropolis-Hastings. All in all, this resulted in three risk premium models that could take advantage of existing pricing knowledge and learn over time as new cases arrived. The results showed that the Bayesian pricing methods significantly reduced the discrepancy between predicted and actual claim costs on an overall portfolio level compared to the static tariff. Nevertheless, this could not be determined on an individual policyholder level.
Stora, väletablerade försäkringsbolag modellerar sina riskpremier med hjälp av statistiska modeller och data från skadeanmälningar. Eftersom försäkringsbolagen har tillgång till en lång historik av skadeanmälningar, så kan de förutspå sina framtida skadeanmälningskostnader med hög precision. Till skillnad från ett stort försäkringsbolag, har en liten, nyetablerad försäkringsstartup inte tillgång till den mängden data. Det nyetablerade försäkringsbolagets initiala prissättningsmodell kan därför istället byggas genom att direkt estimera parametrarna i en tariff med ett icke statistiskt tillvägagångssätt. Problematiken med en sådan metod är att tariffens parametrar inte kan justerares baserat på bolagets egna skadeanmälningar med klassiska frekvensbaserade prissättningsmetoder. I denna masteruppsats presenteras tre metoder för att estimera en existerande statisk multiplikativ tariff. Estimaten kan användas som en prior i en Bayesiansk riskpremiemodell. Likheten mellan premierna som har satts via den estimerade och den faktiska statiska tariffen utvärderas genom att beräkna deras relativa skillnad. Resultaten från jämförelsen tyder på att priorn kan estimeras med hög precision. De estimerade priorparametrarna kombinerades sedan med startupbolaget Hedvigs skadedata. Posteriorn estimerades sedan med Metropolis and Metropolis-Hastings, vilket är två Markov Chain Monte Carlo simuleringsmetoder. Sammantaget resulterade detta i en prissättningsmetod som kunde utnyttja kunskap från en existerande statisk prismodell, samtidigt som den kunde ta in mer kunskap i takt med att fler skadeanmälningar skedde. Resultaten tydde på att de Bayesianska prissättningsmetoderna kunde förutspå skadekostnader med högre precision jämfört med den statiska tariffen.
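A rough sketch of the general approach (a hypothetical one-factor Poisson frequency model with invented numbers, not Hedvig's tariff or the thesis code): the GLM prior is centred on the relativities of an existing multiplicative tariff and updated with claim counts via a random-walk Metropolis sampler.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=500)                     # one binary rating factor per policy
exposure = rng.uniform(0.2, 1.0, size=500)
true_beta = np.array([np.log(0.08), np.log(1.5)])    # base frequency and factor relativity
claims = rng.poisson(exposure * np.exp(true_beta[0] + true_beta[1] * x))

prior_mean = np.array([np.log(0.10), np.log(1.3)])   # taken from the static tariff
prior_sd = np.array([0.3, 0.3])

def log_post(beta):
    lam = exposure * np.exp(beta[0] + beta[1] * x)   # Poisson log-likelihood + Gaussian prior
    return np.sum(claims * np.log(lam) - lam) - 0.5 * np.sum(((beta - prior_mean) / prior_sd) ** 2)

beta, samples = prior_mean.copy(), []
for _ in range(5000):                                # random-walk Metropolis
    prop = beta + 0.05 * rng.standard_normal(2)
    if np.log(rng.uniform()) < log_post(prop) - log_post(beta):
        beta = prop
    samples.append(beta)
print(np.exp(np.mean(samples[1000:], axis=0)))       # posterior base rate and relativity
```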
APA, Harvard, Vancouver, ISO und andere Zitierweisen
32

Amaliksen, Ingvild. „Bayesian Inversion of Time-lapse Seismic Data using Bimodal Prior Models“. Thesis, Norges teknisk-naturvitenskapelige universitet, Institutt for matematiske fag, 2014. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-24625.

Der volle Inhalt der Quelle
Annotation:
The objective of the current study is to make inference about reservoir properties from seismic reflection data. The inversion problem is cast in a Bayesian framework, and we compare and contrast three prior model settings; a Gaussian prior, a mixture Gaussian prior and a generalized Gaussian prior. A Gauss-linear likelihood model is developed and by the convenient properties of the family of Gaussian distributions, we obtain the explicit expressions for the posterior models. The posterior models define computationally efficient inversion methods that can be used to make predictions of the reservoir variables while providing an uncertainty assessment. The inversion methodologies are tested on synthetic seismic data with respect to porosity, water saturation, and change in water saturation between two time steps. The mixture Gaussian and generalized Gaussian posterior models show encouraging results under realistic signal-noise ratios.
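For the Gaussian prior case the closed-form posterior can be sketched directly (a synthetic linear forward model, not the seismic likelihood of the thesis): with prior m ~ N(mu0, S0) and data d = G m + e, e ~ N(0, Se), the posterior mean and covariance follow from the usual Gauss-linear update.

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.standard_normal((15, 4))            # linear forward operator
mu0, S0 = np.zeros(4), np.eye(4)            # Gaussian prior
Se = 0.1 * np.eye(15)                       # noise covariance
m_true = rng.standard_normal(4)
d = G @ m_true + rng.multivariate_normal(np.zeros(15), Se)

S_post = np.linalg.inv(np.linalg.inv(S0) + G.T @ np.linalg.inv(Se) @ G)
mu_post = S_post @ (np.linalg.inv(S0) @ mu0 + G.T @ np.linalg.inv(Se) @ d)
print(np.round(m_true, 2), np.round(mu_post, 2))     # posterior mean close to the truth
print(np.round(np.sqrt(np.diag(S_post)), 3))         # per-parameter uncertainty
```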
APA, Harvard, Vancouver, ISO und andere Zitierweisen
33

Aggarwal, Deepti. „Inferring Signal Transduction Pathways from Gene Expression Data using Prior Knowledge“. Thesis, Virginia Tech, 2015. http://hdl.handle.net/10919/56601.

Der volle Inhalt der Quelle
Annotation:
Plants have developed specific responses to external stimuli such as drought, cold, high salinity in soil, and precipitation, in addition to internal developmental stimuli. These stimuli trigger signal transduction pathways in plants, leading to cellular adaptation. A signal transduction pathway is a network of entities that interact with one another in response to a given stimulus. The participating entities control and affect gene expression in response to the stimulus. For computational purposes, a signal transduction pathway is represented as a network whose nodes are biological molecules and whose directed edges are interactions between two nodes. A plethora of research has been conducted to understand signal transduction pathways. However, there are a limited number of approaches to explore and integrate signal transduction pathways. Therefore, we need a platform to integrate and expand the information of each signal transduction pathway. One of the major computational challenges in inferring signal transduction pathways is that the addition of new nodes and edges can affect the information flow between existing ones in an unknown manner. Here, I develop the Beacon inference engine to address these computational challenges. This software engine employs a network inference approach to predict new edges. First, it uses mutual information and context likelihood of relatedness to predict edges from gene expression time-series data. Subsequently, it incorporates prior knowledge to limit false-positive predictions. Finally, a naive Bayes classifier is used to predict new edges. The Beacon inference engine predicts new edges with a recall rate of 77.6% and a precision of 81.4%; 24% of the predicted edges are new, i.e., they are not present in the prior knowledge.
Master of Science
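The first scoring step can be sketched as follows (synthetic expression data; this is not the Beacon engine itself): pairwise mutual information is estimated from binned profiles and then background-corrected with the CLR transformation, which z-scores each pairwise score against the score distributions of both genes.

```python
import numpy as np
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(0)
expr = rng.standard_normal((5, 200))                 # 5 genes x 200 time points
expr[1] = expr[0] + 0.3 * rng.standard_normal(200)   # gene 1 depends on gene 0

def binned_mi(a, b, bins=8):
    return mutual_info_score(np.digitize(a, np.histogram_bin_edges(a, bins)),
                             np.digitize(b, np.histogram_bin_edges(b, bins)))

n = len(expr)
mi = np.array([[binned_mi(expr[i], expr[j]) for j in range(n)] for i in range(n)])
np.fill_diagonal(mi, 0.0)

z = (mi - mi.mean(axis=1, keepdims=True)) / mi.std(axis=1, keepdims=True)
clr = np.sqrt(np.maximum(z, 0) ** 2 + np.maximum(z.T, 0) ** 2)   # CLR score per gene pair
print(np.round(clr, 2))                              # the (0, 1) entry should dominate
```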
APA, Harvard, Vancouver, ISO und andere Zitierweisen
34

Rincé, Romain. „Behavior recognition on noisy data-streams constrained by complex prior knowledge“. Thesis, Nantes, 2018. http://www.theses.fr/2018NANT4085/document.

Der volle Inhalt der Quelle
Annotation:
Le traitement d’événements complexes (Complex Event Processing – CEP) consiste en l’analyse de flux de données afin den extraire des motifs et comportements particuliers décrits, en général, dans un formalisme logique. Dans l’approche classique, les données d’un flux – ou événements – sont supposées être l’observation complète et parfaite du système produisant ces événements. Cependant, dans de nombreux cas, les moyens permettant la collecte de ces données, tels que des capteurs, ne sont pas pour autant infaillibles et peuvent manquer la détection d’un événement particulier ou au contraire en produire. Dans cette thèse, nous nous sommes employé à étudier les modèles possibles de représentation de l’incertain et, ainsi, offrir au CEP une robustesse vis-à-vis de cette incertitude ainsi que les outils nécessaires pour permettre la reconnaissance de comportement complexe de façon pertinente les flux d’événements en se basant sur le formalisme des chroniques. Dans cette optique, trois approches ont été considérées. La première se base sur les réseaux logiques de Markov pour représenter la structure des chroniques sous un ensemble de formules logiques adjointe dune valeur de confiance. Nous montrons que ce modèle, bien que largement appliqué dans la littérature, est inapplicable pour une application concrète au regard des dimensions d’un tel problème. La seconde approche se basent sur des techniques issues de la communauté SAT pour énumérer l’ensemble des solutions possibles d’un problème donné et ainsi produire une valeur de confiance pour la reconnaissance dune chronique exprimée, encore une fois, sous une requête logique. Finalement, nous proposons une dernière approche basée sur les chaînes de Markov pour produire un ensemble d’échantillons expliquant l’évolution du modèle en accord avec les données observées. Ces échantillons sont ensuite analysés par en système de reconnaissance pour compter les occurrences dune chronique particulière
Complex Event Processing (CEP) consists of the analysis of data-streams in order to extract particular patterns and behaviours described, in general, in a logical formalism. In the classical approach, data of a stream – or events – are supposed to be the complete and perfect observation of the system producing these events. However, in many cases, the means for collecting such data, such as sensors, are not infallible and may miss the detection of a particular event or on the contrary produce. In this thesis, we have studied the possible models of representation of uncertainty and, thus, to offer the CEP a robustness to this uncertainty as well as the necessary tools to allow the recognition of complex behaviours based on the chronicle formalism. In this perspective, three approaches have been considered. The first one is based on Markov logical networks to represent the structure of the chronicles under a set of logical formulas of a confidence value. We show that this model, although widely applied in the literature, is inapplicable for a realistic application with regard to the dimensions of such a problem. The second approach is based on techniques from the SAT community to enumerate all possible solutions of a given problem and thus to produce a confidence value for the recognition of a chronicle expressed, again, under a logical structure. Finally, we propose a last approach based on the Markov chains to produce a set of samples explaining the evolution of the model in agreement with the observed data. These samples are then analysed by a recognition system to count the occurrences of a particular chronicle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
35

BAI, PENG. „Stochastic inversion of time domain electromagnetic data with non-trivial prior“. Doctoral thesis, Università degli Studi di Cagliari, 2022. http://hdl.handle.net/11584/328807.

Der volle Inhalt der Quelle
Annotation:
Inversion deals with inferring information about the subsurface (by reconstructing its physical properties), given: 1) observed data (usually collected at the surface) and 2) available forward modelling tools (describing the physics of the geophysical methodology used). Inevitably, these forward modelling tools are always characterized by some level of approximation, and this inaccuracy unavoidably affects the inversion results. This thesis presents, in particular in the context of airborne electromagnetic data, the impact and relevance of quantifying this source of (coherent) error. Specifically, a possible strategy to quantify the modelling error is discussed in the thesis. The adopted strategy for the estimation of the modelling error makes use of prior knowledge about the investigated system. The same prior knowledge is necessary in stochastic inversion frameworks. Stochastic inversion provides a natural way 1) to assess the uncertainty of the final results and 2) to incorporate complex prior information into the inversion, from sources other than the geophysical observations. Since the assessment of the modelling error is based on prior information that is also used in the stochastic inversion approaches, it is a natural choice to adopt these probabilistic strategies. By taking the modelling error into account, stochastic inversions can eliminate or, at least, minimize the effects of the forward approximation on the inversion results. In this thesis, through synthetic and field tests, we discuss stochastic inversion that accounts for the modelling error. What is called the prior in the framework of stochastic inversion is comparable to the training dataset in the context of Neural Networks: to some extent, in both cases, the final solution is by construction "stationary" with respect to the initially provided ensemble used to feed (or train) the inversion algorithm. Based also on these premises, and in an attempt to address the "definitive" problem of a fully 3D stochastic inversion, we verify the possibility of an extremely efficient Neural Network strategy for the inversion of massive airborne geophysical datasets. Some preliminary, but still very promising, results on this matter are discussed in the second-to-last chapter of this thesis. Also in this case, the conclusions are drawn based on synthetic and experimental data.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
36

Filoche, Arthur. „Variational Data Assimilation with Deep Prior. Application to Geophysical Motion Estimation“. Electronic Thesis or Diss., Sorbonne université, 2022. http://www.theses.fr/2022SORUS416.

Der volle Inhalt der Quelle
Annotation:
La récente résurgence de l'apprentissage profond a bouleversé l'état de l'art dans bon nombre de domaines scientifiques manipulant des données en grande dimension. En particulier, la disponibilité et la flexibilité des algorithmes ont permis d'automatiser la résolution de divers problèmes inverses, apprenant des estimateurs directement des donnés. Ce changement de paradigme n'a pas échappé à la recherche en prévision météorologique numérique. Cependant, les problématiques inhérentes aux géosciences comme l'imperfection des données et l'absence de vérité terrain compliquent l'application directe des méthodes d'apprentissage. Les algorithmes classiques d'assimilation de données, cadrant ces problèmes et permettant d'inclure des connaissances physiques, restent à l'heure actuelle les méthodes de choix dans les centres de prévision météorologique opérationnels. Dans cette thèse, nous étudions expérimentalement l'hybridation d'algorithmes combinant apprentissage profond et assimilation de données, avec pour objectif de corriger des erreurs de prévisions dues à l'incomplétude des modèles physiques ou à la méconnaissance des conditions initiales. Premièrement, nous mettons en évidence les similitudes et nuances entre assimilation de données variationnelles et apprentissage profond. Suivant l'état de l'art, nous exploitons la complémentarité des deux approches dans un algorithme itératif pour ensuite proposer une méthode d'apprentissage de bout-en-bout. Dans un second temps, nous abordons le cœur de la thèse : l'assimilation de donnés variationnelles avec a priori profond, régularisant des estimateurs classiques avec des réseaux de neurones convolutionnels. L'idée est déclinée dans différents algorithmes incluant interpolation optimale, 4DVAR avec fortes et faibles contraintes, assimilation et super-résolution ou estimation d'incertitude simultanées. Nous concluons avec des perspectives sur les hybridations proposées
The recent revival of deep learning has impacted the state of the art in many scientific fields handling high-dimensional data. In particular, the availability and flexibility of algorithms have allowed the automation of inverse problem solving, learning estimators directly from data. This paradigm shift has also reached the research field of numerical weather prediction. However, issues inherent to the geosciences, such as imperfect data and the lack of ground truth, complicate the direct application of learning methods. Classical data assimilation algorithms, which frame these issues and allow the use of physics-based constraints, are currently the methods of choice in operational weather forecasting centers. In this thesis, we experimentally study the hybridization of deep learning and data assimilation algorithms, with the objective of correcting forecast errors due to incomplete physical models or uncertain initial conditions. First, we highlight the similarities and nuances between variational data assimilation and deep learning. Following the state of the art, we exploit the complementarity of the two approaches in an iterative algorithm and then propose an end-to-end learning method. In a second part, we address the core of the thesis: variational data assimilation with a deep prior, regularizing classical estimators with convolutional neural networks. The idea is instantiated in various algorithms, including optimal interpolation, 4DVAR with strong and weak constraints, and simultaneous assimilation and super-resolution or uncertainty estimation. We conclude with perspectives on the proposed hybridization.
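As a point of reference for the variational building block, here is a simplified strong-constraint 4DVar sketch (linear toy dynamics in plain numpy; the deep-prior regularisation studied in the thesis is not included): the initial state is found by gradient descent on a cost that balances a background term against the misfit to observations propagated through the dynamics.

```python
import numpy as np

rng = np.random.default_rng(0)
M = np.array([[1.0, 0.1], [-0.1, 0.95]])      # linear model over one assimilation window
H = np.eye(2)                                 # observe the full state
B_inv, R_inv = np.eye(2), 4.0 * np.eye(2)     # background and observation precisions
x_truth = np.array([1.0, -0.5])
obs = [np.linalg.matrix_power(M, k) @ x_truth + 0.1 * rng.standard_normal(2) for k in range(5)]
x_b = np.array([0.0, 0.0])                    # background (prior) initial state

def grad_J(x0):
    g = B_inv @ (x0 - x_b)
    for k, y in enumerate(obs):
        Mk = np.linalg.matrix_power(M, k)
        g += Mk.T @ H.T @ R_inv @ (H @ Mk @ x0 - y)   # adjoint of the propagated misfit
    return g

x0 = x_b.copy()
for _ in range(500):                          # plain gradient descent on the 4DVar cost
    x0 -= 0.02 * grad_J(x0)
print(x_truth, np.round(x0, 3))               # the analysis moves from x_b towards the truth
```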
APA, Harvard, Vancouver, ISO und andere Zitierweisen
37

Zhang, Xiang. „Analysis of Spatial Data“. UKnowledge, 2013. http://uknowledge.uky.edu/statistics_etds/4.

Der volle Inhalt der Quelle
Annotation:
In many areas of the agricultural, biological, physical and social sciences, spatial lattice data are becoming increasingly common. In addition, a large amount of lattice data shows not only a visible spatial pattern but also a temporal pattern (see Zhu et al. 2005). An interesting problem is to develop a framework that systematically models the relationship between the response variable and possible explanatory variables, while accounting for space and time effects simultaneously. Spatial-temporal linear models and the corresponding likelihood-based statistical inference are important tools for the analysis of spatial-temporal lattice data. We propose a general asymptotic framework for spatial-temporal linear models and investigate the properties of maximum likelihood estimates under this framework. Mild regularity conditions on the spatial-temporal weight matrices are imposed in order to derive the asymptotic properties (consistency and asymptotic normality) of the maximum likelihood estimates. A simulation study is conducted to examine the finite-sample properties of the maximum likelihood estimates. For spatial data, aside from traditional likelihood-based methods, a variety of literature has discussed Bayesian approaches to estimating the correlation (auto-covariance function) among spatial data; in particular, Zheng et al. (2010) proposed a nonparametric Bayesian approach to estimate a spectral density. We also discuss nonparametric Bayesian approaches to analyzing spatial data. We propose a general procedure for constructing a multivariate Feller prior and establish its theoretical properties as a nonparametric prior. A blocked Gibbs sampling algorithm is also proposed for computation since the posterior distribution is analytically manageable.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
38

Ching, Kai-Sang. „Priority CSMA schemes for integrated voice and data transmission“. Thesis, University of British Columbia, 1988. http://hdl.handle.net/2429/28372.

Der volle Inhalt der Quelle
Annotation:
Priority schemes employing the inherent properties of carrier-sense multiple-access (CSMA) schemes are investigated and then applied to the integrated transmission of voice and data. A priority scheme composed of 1-persistent and non-persistent CSMA protocols is proposed. The throughput and delay characteristics of this protocol are evaluated by mathematical analysis and simulation, respectively. The throughput analysis is further extended to another, more general case, p-persistent CSMA with two persistency factors, whose throughput performance had not been analyzed before. Simulations are carried out to study the delay characteristics of this protocol. After careful consideration of the features of the priority schemes studied, two protocols are proposed for integrated voice and data transmission. While their ultimate purpose is integrated services, they have different applications. One of them is applied to local area networks; the other is suitable for packet radio networks. The distinctive features of the former are simplicity and flexibility. The latter differs from other studies in that collision detection is not required, and in that it achieves a small mean and variance of voice packet delay. Performance characteristics of both of these protocols are examined by simulations under various system parameter values.
Applied Science, Faculty of
Electrical and Computer Engineering, Department of
Graduate
APA, Harvard, Vancouver, ISO und andere Zitierweisen
39

Walter, Gero [Verfasser], und Thomas [Akademischer Betreuer] Augustin. „Generalized Bayesian inference under prior-data conflict / Gero Walter. Betreuer: Thomas Augustin“. München : Universitätsbibliothek der Ludwig-Maximilians-Universität, 2013. http://d-nb.info/1052779247/34.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
40

Wahlqvist, Kristian. „A Comparison of Motion Priors for EKF-SLAM in Autonomous Race Cars“. Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-262673.

Der volle Inhalt der Quelle
Annotation:
Simultaneous Localization and Mapping (SLAM) is one of the fundamental problems to solve for any autonomous vehicle or robot. The SLAM problem is for an agent to incrementally build a map of its surroundings while keeping track of its location within the map, using various sensors. The goal of this thesis is to demonstrate the differences and limitations of different odometry methods when used as the prior motion estimate for the common SLAM algorithm EKF-SLAM. Inspired by autonomous racing, the algorithms were evaluated in especially difficult driving scenarios, such as high velocities and skidding. Three different odometry algorithms that rely on different sensors were implemented: the feature-based stereo visual odometry algorithm Libviso2, the lidar odometry algorithm Range Flow-based 2D Odometry (RF2O), and wheel odometry fused with measurements of the vehicle's angular velocity from a gyroscope. The algorithms were evaluated separately on real data gathered by driving a modified RC car, equipped with the necessary sensors, around different racing track configurations. The car was driven at different levels of aggressiveness, where more aggressive driving implies higher velocity and skidding. The SLAM estimates of the vehicle position and cone locations were evaluated in terms of mean absolute error (MAE) and computational time, for each motion prior separately on each track. The results show that Libviso2 provides an accurate prior motion estimate with consistent performance over all test cases. RF2O and the wheel odometry approach could in some cases provide a prior motion estimate that was sufficient for accurate SLAM performance, but performed poorly in other cases.
Simultan lokalisering och kartläggning (SLAM) är ett grundläggande problem att lösa för alla typer av autonoma fordon eller robotar. SLAM handlar om problemet för en agent att inkrementellt konstruera en karta av sin omgivning samtidigt som den håller koll på sin position inom kartan, med hjälp av diverse sensorer. Målet med detta examensarbete är att demonstrera skillnader och begränsningar för olika metoder att uppskatta bilens momentana förflyttning, när denna momentana förflyttning används som en fösta skattning av bilens rörelse för SLAM-algoritmen EKF-SLAM. Utvärderingen grundar sig i autonom motorsport och de undersökta algoritmerna utvärderades under speciellt svåra förhållanden så som hög hastighet och när bilen sladdar. Tre olika metoder för att skatta bilens momentana förflyttning implementerades där samtliga metoder baseras på data från olika sensorer. Dessa var den visuella metoden Libviso2 som använder stereokameror, den flödesbaserade metoden RF2O som använder en 2D lidar, samt en metod som baseras på hjulens rotationshastighet som kombinerades med fordonets uppmätta vinkelhastighet från ett gyroskop. De olika algoritmerna utvärderades separat på data som genererats genom att köra en modifierad radiostyrd bil runt olika banor utmarkerade av trafikkoner, samt för olika nivåer av aggressiv körstil. Estimeringen av bilens bana och konernas positioner jämfördes sedan separat i termer av medelabsolutfel samt beräkningstid för varje metod och bana. Resultaten visar att Libviso2 ger en bra skattning av den momentana förflyttningen och presterar konsekvent över samtliga tester. RF2O och metoden baserad på hjulens rotationshastighet var i vissa fall tillräckligt bra för korrekt lokalisering och kartläggning, men presterade dåligt i andra fall.
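The role of the motion prior can be sketched with the EKF prediction step alone (a simple odometry motion model with hypothetical noise values; not the thesis implementation): the odometry increment moves the pose mean, and the process noise, which would be larger for aggressive driving or skidding, inflates its covariance before any landmark update.

```python
import numpy as np

def ekf_predict(pose, P, odom, Q):
    """pose = [x, y, theta]; odom = [dx, dy, dtheta] expressed in the robot frame."""
    x, y, th = pose
    dx, dy, dth = odom
    c, s = np.cos(th), np.sin(th)
    new_pose = np.array([x + c * dx - s * dy, y + s * dx + c * dy, th + dth])
    F = np.array([[1, 0, -s * dx - c * dy],           # Jacobian of the motion model w.r.t. pose
                  [0, 1,  c * dx - s * dy],
                  [0, 0,  1]])
    return new_pose, F @ P @ F.T + Q

pose, P = np.zeros(3), np.diag([0.01, 0.01, 0.001])
Q = np.diag([0.02, 0.02, 0.005])                      # larger for aggressive driving / skidding
pose, P = ekf_predict(pose, P, odom=np.array([0.5, 0.0, 0.1]), Q=Q)
print(pose, np.round(np.diag(P), 4))
```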
APA, Harvard, Vancouver, ISO und andere Zitierweisen
41

Abufadel, Amer Y. „4D Segmentation of Cardiac MRI Data Using Active Surfaces with Spatiotemporal Shape Priors“. Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/14005.

Der volle Inhalt der Quelle
Annotation:
This dissertation presents a fully automatic segmentation algorithm for cardiac MR data. Some of the currently published methods are automatic, but they only work well in 2D and sometimes in 3D and do not perform well near the extremities (apex and base) of the heart. Additionally, they require substantial user input to make them feasible for use in a clinical environment. This dissertation introduces novel approaches to improve the accuracy, robustness, and consistency of existing methods. Segmentation accuracy can be improved by knowing as much about the data as possible. Accordingly, we compute a single 4D active surface that performs segmentation in space and time simultaneously. The segmentation routine can now take advantage of information from neighboring pixels that can be adjacent either spatially or temporally. Robustness is improved further by using confidence labels on shape priors. Shape priors are deduced from manual segmentation of training data. This data may contain imperfections that may impede proper manual segmentation. Confidence labels indicate the level of fidelity of the manual segmentation to the actual data. The contribution of regions with low confidence levels can be attenuated or excluded from the final result. The specific advantages of using the 4D segmentation along with shape priors and regions of confidence are highlighted throughout the thesis dissertation. Performance of the new method is measured by comparing the results to traditional 3D segmentation and to manual segmentation performed by a trained clinician.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
42

Sonksen, Michael David. „Bayesian Model Diagnostics and Reference Priors for Constrained Rate Models of Count Data“. The Ohio State University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=osu1312909127.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
43

Grün, Bettina, und Gertraud Malsiner-Walli. „Bayesian Latent Class Analysis with Shrinkage Priors: An Application to the Hungarian Heart Disease Data“. FedOA -- Federico II University Press, 2018. http://epub.wu.ac.at/6612/1/heart.pdf.

Der volle Inhalt der Quelle
Annotation:
Latent class analysis explains dependency structures in multivariate categorical data by assuming the presence of latent classes. We investigate the specification of suitable priors for the Bayesian latent class model to determine the number of classes and perform variable selection. Estimation is possible using standard tools implementing general purpose Markov chain Monte Carlo sampling techniques such as the software JAGS. However, class specific inference requires suitable post-processing in order to eliminate label switching. The proposed Bayesian specification and analysis method is applied to the Hungarian heart disease data set to determine the number of classes and identify relevant variables and results are compared to those obtained with the standard prior for the component specific parameters.
APA, Harvard, Vancouver, ISO und andere Zitierweisen
44

Nagy, Arnold B. „Priority area performance and planning areas with limited biological data“. Thesis, University of Sheffield, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.425193.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
45

Ostovari, Pouya. „Priority-Based Data Transmission in Wireless Networks using Network Coding“. Diss., Temple University Libraries, 2015. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/360800.

Der volle Inhalt der Quelle
Annotation:
Computer and Information Science
Ph.D.
With the rapid development of mobile device technology, these devices are becoming very popular and a part of our everyday lives. These devices, which are equipped with wireless radios such as cellular and WiFi radios, affect almost every aspect of our lives. People use smartphones and tablets to access the Internet, watch videos, chat with their friends, etc. The wireless connections that these devices provide are more convenient than wired connections. However, there are two main challenges in wireless networks: error-prone wireless links and limited network resources. Network coding is widely used to provide reliable data transmission and to use network resources efficiently. Network coding is a technique in which the original packets are mixed together using algebraic operations. In this dissertation, we study the applications of network coding in making wireless transmissions robust against transmission errors and in efficient resource management. In many types of data, different parts of the data have different importance. For instance, in the case of numeric data, the importance decreases from the most significant to the least significant bit. Also, in multi-layer videos, the importance of the packets in different layers of the video is not the same. We propose novel data transmission methods for wireless networks that consider the unequal importance of the different parts of the data. In order to provide robust data transmission and use the limited resources efficiently, we use the random linear network coding technique, which is a type of network coding. In the first part of this dissertation, we study the application of network coding in resource management. In order to use the limited storage of cache nodes efficiently, we propose to use triangular network coding for content distribution. We also design a scalable video-on-demand system, which uses helper nodes and network coding to provide users with their desired video quality. In the second part, we investigate the application of network coding in providing robust wireless transmission. We propose symbol-level network coding, in which each packet is partitioned into symbols of different importance. We also propose a method that uses network coding to make multi-layer videos robust against transmission errors.
Temple University--Theses
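
As a point of reference for the coding technique named in the abstract, the sketch below illustrates random linear network coding in its simplest binary form: packets are combined with random coefficients over GF(2) and recovered by Gaussian elimination once enough independent combinations arrive. This is only a toy illustration of the general idea; the triangular and symbol-level coding schemes developed in the dissertation are not reproduced here.

import numpy as np

rng = np.random.default_rng()

def encode(packets, n_coded):
    # packets: (k, L) array of bits; returns random GF(2) coefficient
    # vectors and the corresponding coded packets (XOR combinations).
    k = packets.shape[0]
    coeffs = rng.integers(0, 2, size=(n_coded, k), dtype=np.uint8)
    coded = (coeffs @ packets) % 2
    return coeffs, coded

def decode(coeffs, coded, k):
    # Gaussian elimination over GF(2) on the augmented matrix [coeffs | coded];
    # fails if the received combinations do not yet have full rank.
    A = np.concatenate([coeffs, coded], axis=1).astype(np.uint8)
    row = 0
    for col in range(k):
        pivot = next((r for r in range(row, A.shape[0]) if A[r, col]), None)
        if pivot is None:
            raise ValueError("not enough independent coded packets yet")
        A[[row, pivot]] = A[[pivot, row]]
        for r in range(A.shape[0]):
            if r != row and A[r, col]:
                A[r] ^= A[row]
        row += 1
    return A[:k, k:]

# Usage sketch: encode 4 original packets into, say, 6 coded packets and call
# decode(coeffs, coded, k=4); any 4 linearly independent combinations suffice
# to recover the originals.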
APA, Harvard, Vancouver, ISO und andere Zitierweisen
46

Li, Bin. „Statistical learning and predictive modeling in data mining“. Columbus, Ohio : Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1155058111.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
47

Crabb, Ryan Eugene. „Fast Time-of-Flight Phase Unwrapping and Scene Segmentation Using Data Driven Scene Priors“. Thesis, University of California, Santa Cruz, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=3746704.

Der volle Inhalt der Quelle
Annotation:

This thesis concerns full-field time-of-flight depth imaging with amplitude-modulated continuous-wave signals correlated against step-shifted reference waveforms on a specialized solid-state CMOS sensor referred to as a photonic mixing device. The specific focus is the inherent depth ambiguity that arises from a fundamental property of periodic signals: they repeat, or wrap, after each period, so any signal shifted by a whole number of wavelengths is indistinguishable from the original. Recovering the full extent of the signal's path is known as phase unwrapping. The commonly accepted solution requires imaging a series of two or more signals with differing modulation frequencies to resolve the ambiguity; the associated time delay results in erroneous or invalid measurements for non-static elements of the scene. This work details a physical model of the observable illumination of the scene, which provides priors for a novel probabilistic framework that recovers the scene geometry by imaging only a single modulated signal. It is demonstrated that this process provides more than adequate results in a majority of representative scenes, and that it runs on typical computer hardware at a speed that allows the range imaging to be used in real-time, interactive applications.

One such real-time application is presented: alpha matting, or foreground segmentation, for background substitution in live video. This is a generalized version of the common green-screening technique used, for example, by every local weather reporter. The presented method, however, requires no special background and operates on high-resolution video using a lower-resolution depth image.
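
The depth ambiguity described in the first paragraph of this abstract has a compact form: a continuous-wave sensor measures the phase shift only modulo 2π, so the range is determined only up to integer multiples of c/(2f). The snippet below simply enumerates the candidate ranges consistent with one measured phase; the modulation frequency is an arbitrary example value, and the thesis's probabilistic, prior-driven selection among these candidates is not reproduced here.

import numpy as np

C = 299_792_458.0  # speed of light in m/s

def candidate_ranges(phase, mod_freq_hz, max_wraps=4):
    # All ranges d = (phase / 2pi + n) * c / (2 f) for n = 0..max_wraps,
    # i.e. the distances that produce the same wrapped phase measurement.
    ambiguity = C / (2.0 * mod_freq_hz)  # unambiguous range of the sensor
    n = np.arange(max_wraps + 1)
    return (phase / (2.0 * np.pi) + n) * ambiguity

# Example: at a 30 MHz modulation frequency the unambiguous range is about
# 5 m, so a measured phase of pi/2 could correspond to roughly 1.25 m,
# 6.25 m, 11.24 m, and so on.
print(candidate_ranges(np.pi / 2, 30e6))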

APA, Harvard, Vancouver, ISO und andere Zitierweisen
48

Tarca, Adi-Laurentiu. „Neural networks in multiphase reactors data mining: feature selection, prior knowledge, and model design“. Thesis, Université Laval, 2004. http://www.theses.ulaval.ca/2004/21673/21673.pdf.

Der volle Inhalt der Quelle
Annotation:
Les réseaux de neurones artificiels (RNA) suscitent toujours un vif intérêt dans la plupart des domaines d’ingénierie non seulement pour leur attirante « capacité d’apprentissage » mais aussi pour leur flexibilité et leur bonne performance, par rapport aux approches classiques. Les RNA sont capables «d’approximer» des relations complexes et non linéaires entre un vecteur de variables d’entrées x et une sortie y. Dans le contexte des réacteurs multiphasiques le potentiel des RNA est élevé car la modélisation via la résolution des équations d’écoulement est presque impossible pour les systèmes gaz-liquide-solide. L’utilisation des RNA dans les approches de régression et de classification rencontre cependant certaines difficultés. Un premier problème, général à tous les types de modélisation empirique, est celui de la sélection des variables explicatives qui consiste à décider quel sous-ensemble xs ⊂ x des variables indépendantes doit être retenu pour former les entrées du modèle. Les autres difficultés à surmonter, plus spécifiques aux RNA, sont : le sur-apprentissage, l’ambiguïté dans l’identification de l’architecture et des paramètres des RNA et le manque de compréhension phénoménologique du modèle résultant. Ce travail se concentre principalement sur trois problématiques dans l’utilisation des RNA: i) la sélection des variables, ii) l’utilisation de la connaissance apriori, et iii) le design du modèle. La sélection des variables, dans le contexte de la régression avec des groupes adimensionnels, a été menée avec les algorithmes génétiques. Dans le contexte de la classification, cette sélection a été faite avec des méthodes séquentielles. Les types de connaissance a priori que nous avons insérés dans le processus de construction des RNA sont : i) la monotonie et la concavité pour la régression, ii) la connectivité des classes et des coûts non égaux associés aux différentes erreurs, pour la classification. Les méthodologies développées dans ce travail ont permis de construire plusieurs modèles neuronaux fiables pour les prédictions de la rétention liquide et de la perte de charge dans les colonnes garnies à contre-courant ainsi que pour la prédiction des régimes d’écoulement dans les colonnes garnies à co-courant.
Artificial neural networks (ANN) have recently gained enormous popularity in many engineering fields, not only for their appealing “learning ability” but also for their versatility and superior performance with respect to classical approaches. Without supposing a particular equational form, ANNs mimic complex nonlinear relationships that might exist between an input feature vector x and a dependent (output) variable y. In the context of multiphase reactors the potential of neural networks is high, since modeling by resolution of first-principle equations to forecast the sought key hydrodynamic and transfer characteristics is intractable. The general-purpose applicability of neural networks in regression and classification, however, poses some subsidiary difficulties that can make their use inappropriate for certain modeling problems. Some of these problems are general to any empirical modeling technique, including the feature selection step, in which one has to decide which subset xs ⊂ x should constitute the inputs (regressors) of the model. Other weaknesses specific to neural networks are overfitting, model design ambiguity (architecture and parameter identification), and the lack of interpretability of the resulting models. This work addresses three issues in the application of neural networks: i) feature selection, ii) prior knowledge matching within the models (to address, to some extent, the overfitting and interpretability issues), and iii) model design. Feature selection was conducted with genetic algorithms (yet another companion from the artificial intelligence area), which allowed the identification of good combinations of dimensionless inputs for regression ANNs, or with sequential methods in a classification context. The types of a priori knowledge the resulting ANN models were required to match were monotonicity and/or concavity in regression, and class connectivity and different misclassification costs in classification. Although the purpose of the study was rather methodological, some of the resulting ANN models might be considered contributions per se. These models, direct proofs of the underlying methodologies, are useful for predicting liquid hold-up and pressure drop in counter-current packed beds and flow regime type in trickle beds.
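
To illustrate the genetic-algorithm feature-selection idea mentioned above, the sketch below evolves binary masks over the candidate inputs and scores each mask with a user-supplied fitness function (in practice, a cross-validated performance measure of an ANN trained on the selected columns). The operators and parameter values are generic illustrative choices, not the configuration used in the thesis.

import numpy as np

rng = np.random.default_rng(42)

def ga_feature_selection(score_fn, n_features, pop_size=20, generations=30,
                         crossover_rate=0.8, mutation_rate=0.05):
    # Each individual is a binary mask over the n_features candidate inputs.
    pop = rng.integers(0, 2, size=(pop_size, n_features), dtype=np.uint8)
    pop[pop.sum(axis=1) == 0, 0] = 1  # avoid empty feature sets
    fitness = np.array([score_fn(m) for m in pop])
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            # tournament selection of two parents
            i = rng.integers(0, pop_size, size=2)
            j = rng.integers(0, pop_size, size=2)
            p1 = pop[i[np.argmax(fitness[i])]]
            p2 = pop[j[np.argmax(fitness[j])]]
            if rng.random() < crossover_rate:
                child = np.where(rng.random(n_features) < 0.5, p1, p2)  # uniform crossover
            else:
                child = p1.copy()
            flip = rng.random(n_features) < mutation_rate  # bit-flip mutation
            child = np.where(flip, 1 - child, child).astype(np.uint8)
            if child.sum() == 0:
                child[rng.integers(n_features)] = 1
            children.append(child)
        pop = np.array(children)
        fitness = np.array([score_fn(m) for m in pop])
    best = int(np.argmax(fitness))
    return pop[best].astype(bool), fitness[best]

# Toy usage: reward masks that keep the first three features, penalize size.
# In practice score_fn would train and cross-validate an ANN on X[:, mask].
toy_score = lambda mask: float(mask[:3].sum()) - 0.1 * float(mask.sum())
best_mask, best_score = ga_feature_selection(toy_score, n_features=10)
print(best_mask, best_score)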
APA, Harvard, Vancouver, ISO und andere Zitierweisen
49

Mehdi, Riyadh Abdul Kadir. „An investigation of object recognition using spatial data and the concept of prior expectation“. Thesis, University of Liverpool, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.291671.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
50

Jeanmougin, Marine. „Statistical methods for robust analysis of transcriptome data by integration of biological prior knowledge“. Thesis, Evry-Val d'Essonne, 2012. http://www.theses.fr/2012EVRY0029/document.

Der volle Inhalt der Quelle
Annotation:
Au cours de la dernière décennie, les progrès en Biologie Moléculaire ont accéléré le développement de techniques d'investigation à haut-débit. En particulier, l'étude du transcriptome a permis des avancées majeures dans la recherche médicale. Dans cette thèse, nous nous intéressons au développement de méthodes statistiques dédiées au traitement et à l'analyse de données transcriptomiques à grande échelle. Nous abordons le problème de sélection de signatures de gènes à partir de méthodes d'analyse de l'expression différentielle et proposons une étude de comparaison de différentes approches, basée sur plusieurs stratégies de simulations et sur des données réelles. Afin de pallier les limites de ces méthodes classiques qui s'avèrent peu reproductibles, nous présentons un nouvel outil, DiAMS (DIsease Associated Modules Selection), dédié à la sélection de modules de gènes significatifs. DiAMS repose sur une extension du score-local et permet l'intégration de données d'expressions et de données d'interactions protéiques. Par la suite, nous nous intéressons au problème d'inférence de réseaux de régulation de gènes. Nous proposons une méthode de reconstruction à partir de modèles graphiques Gaussiens, basée sur l'introduction d'a priori biologique sur la structure des réseaux. Cette approche nous permet d'étudier les interactions entre gènes et d'identifier des altérations dans les mécanismes de régulation, qui peuvent conduire à l'apparition ou à la progression d'une maladie. Enfin l'ensemble de ces développements méthodologiques sont intégrés dans un pipeline d'analyse que nous appliquons à l'étude de la rechute métastatique dans le cancer du sein
Recent advances in Molecular Biology have led biologists toward high-throughput genomic studies. In particular, the investigation of the human transcriptome offers unprecedented opportunities for understanding cellular and disease mechanisms. In this PhD, we focus on providing robust statistical methods dedicated to the treatment and analysis of high-throughput transcriptome data. We discuss the differential analysis approaches available in the literature for identifying genes associated with a phenotype of interest and propose a comparison study, providing practical recommendations on the appropriate method to use based on various simulation models and real datasets. With the eventual goal of overcoming the inherent instability of differential analysis strategies, we developed an innovative approach called DiAMS (DIsease Associated Modules Selection). This method selects significant modules of genes rather than individual genes and integrates both transcriptome and protein interaction data in a local-score strategy. We then focus on developing a framework to infer gene regulatory networks through the integration of a biologically informative prior over network structures using Gaussian graphical models. This approach offers the possibility of exploring the molecular relationships between genes, leading to the identification of altered regulations potentially involved in disease processes. Finally, we apply our statistical developments to the study of metastatic relapse in breast cancer.
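
As a point of reference for the Gaussian-graphical-model step described above, the sketch below estimates a sparse precision matrix from expression-like data with a plain graphical lasso and reads its non-zero off-diagonal entries as conditional-dependence edges. The thesis's key ingredient, a biologically informative prior over network structures, is not reproduced here; the toy data and the scikit-learn package choice are for illustration only.

import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))  # toy data: 60 samples x 8 "genes"

model = GraphicalLassoCV().fit(X)
precision = model.precision_  # sparse inverse covariance estimate

# Partial correlations: rho_ij = -Omega_ij / sqrt(Omega_ii * Omega_jj)
d = np.sqrt(np.diag(precision))
partial_corr = -precision / np.outer(d, d)
np.fill_diagonal(partial_corr, 1.0)

# Edges of the inferred network: pairs with a non-zero precision entry,
# i.e. genes that remain dependent after conditioning on all the others.
edges = [(i, j) for i in range(X.shape[1]) for j in range(i + 1, X.shape[1])
         if abs(precision[i, j]) > 1e-8]
print(edges)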
APA, Harvard, Vancouver, ISO und andere Zitierweisen