Doctoral dissertations on the topic "Données de préférence"
Create a correct reference in APA, MLA, Chicago, Harvard, and many other styles
Check out the top 32 doctoral dissertations on the topic "Données de préférence".
The "Add to bibliography" button is available next to each work in the bibliography. Use it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of a publication as a ".pdf" file and read its abstract online, whenever these are available in the metadata.
Browse doctoral dissertations from a wide range of disciplines and compile the corresponding bibliographies.
Alami, Karim. "Optimisation des requêtes de préférence skyline dans des contextes dynamiques". Thesis, Bordeaux, 2020. http://www.theses.fr/2020BORD0135.
Preference queries are useful tools for computing small representatives of datasets or for ranking tuples according to users’ preferences. In this thesis, we mainly focus on the optimization of skyline queries, a special class of preference queries, in dynamic contexts. In the first part, we address the incremental maintenance of the multidimensional indexing structure NSC, which has been shown to be efficient for answering skyline queries in a static context. More precisely, we address (i) the case of dynamic data, i.e., tuples are inserted or deleted at any time, and (ii) the case of streaming data, i.e., tuples are appended only and discarded after a specific interval of time. In the case of dynamic data, we redesign the structure and propose procedures to handle both insertions and deletions efficiently. In the case of streaming data, we propose MSSD, a data pipeline which operates in batch mode and maintains NSCt, a variation of NSC. In the second part, we address the case of dynamic orders, i.e., some or all attributes of the dataset are nominal and each user expresses his/her own partial order on these attributes’ domains. We propose highly scalable parallel algorithms that decompose an issued query into a set of sub-queries and process each sub-query independently. As a further optimization step, we propose the partial materialization of sub-queries and introduce the problem of cost-driven sub-query selection.
Ben, Hassine Soumaya. "Évaluation et requêtage de données multisources : une approche guidée par la préférence et la qualité des données : application aux campagnes marketing B2B dans les bases de données de prospection". Thesis, Lyon 2, 2014. http://www.theses.fr/2014LYO22012/document.
In Business-to-Business (B-to-B) marketing campaigns, manufacturing "the highest volume of sales at the lowest cost" and achieving the best return on investment (ROI) score is a significant challenge. ROI performance depends on a set of subjective and objective factors such as dialogue strategy, invested budget, marketing technology and organisation, and above all data and, particularly, data quality. However, data issues in marketing databases are overwhelming, leading to insufficient target knowledge that handicaps B-to-B salespersons when interacting with prospects. B-to-B prospection data is indeed mainly structured through a set of independent, heterogeneous, separate and sometimes overlapping files that form a messy multisource prospect selection environment. Data quality thus appears as a crucial issue when dealing with prospection databases. Moreover, beyond data quality, the ROI metric mainly depends on campaign costs. Given the vagueness of the definition of (direct and indirect) costs, we limit our focus to price considerations. Price and quality thus define the fundamental constraints data marketers consider when designing a marketing campaign file, as they typically look for the "best-qualified selection at the lowest price". However, this goal is not always reachable and compromises often have to be defined. Compromise must first be modelled and formalized, and then deployed for multisource selection issues. In this thesis, we propose a preference-driven selection approach for multisource environments that aims at: 1) modelling and quantifying decision makers’ preferences, and 2) defining and optimizing a selection routine based on these preferences. Concretely, we first deal with the data marketer’s quality preference modelling by appraising multisource data using robust evaluation criteria (quality dimensions) that are rigorously summarized into a global quality score.
Based on this global quality score and data price, we exploit in a second step a preference-based selection algorithm to return "the best qualified records bearing the lowest possible price". An optimisation algorithm, BrokerACO, is finally run to generate the best selection result.
Jerbi, Houssem. "Personnalisation d'analyses décisionnelles sur des données multidimensionnelles". Phd thesis, Toulouse 1, 2012. http://tel.archives-ouvertes.fr/tel-00695371.
Chouiref, Zahira. "Contribution à l'étude de l'optimisation de requêtes de services Web : une approche centrée utilisateur". Thesis, Chasseneuil-du-Poitou, Ecole nationale supérieure de mécanique et d'aérotechnique, 2017. http://www.theses.fr/2017ESMA0016.
The internet has completely transformed the way we communicate (access to information). Its evolution has been marked by strong growth in published services, accompanied by an explosion in the number of users and a diversity of their profiles and contexts. The work presented in this thesis deals with the adaptive optimization of Web service queries to user needs. The problem is to select a service or a combination of relevant services from a collection of candidates able to perform a required task. These candidate services must meet the requirements requested by the user, and the selection is made on the basis of non-functional criteria. In our approach, the non-functional criteria considered are all associated with the preferences of the service requester. Significant attention is therefore paid to the user, who is at the core of the selection system. This selection is generally considered a complex task because of the diversity of the profiles and contexts in which the service is performed. Our study focuses mainly on the analysis of different service selection approaches. We especially highlight their contribution to solving the problems inherent in selecting the best services in order to meet the non-functional parameters of the request. Second, our interest has focused on modeling the specification of supply and demand for services, their context and profile, as well as the two families of preferences: explicit and implicit. Finally, we propose a novel optimization approach that integrates a query reformulation strategy by introducing implicit preferences based on the fuzzy inference process. The idea is to combine the two families of preferences required by the user while simultaneously taking into account the profiles and contexts of both the services and the user.
The application of fuzzy set theory to the optimization of customers' preference queries, by integrating a reasoning module on user-related information, is of great interest for improving the quality of results. Finally, we present a set of experiments to demonstrate the validity and relevance of the proposed approach.
Marie, Damien. "Anatomie du gyrus de Heschl et spécialisation hémisphérique : étude d'une base de données de 430 sujets témoins volontaire sains". Thesis, Bordeaux 2, 2013. http://www.theses.fr/2013BOR22072/document.
This thesis concerns the macroscopic anatomy of Heschl’s gyri (HG) in relation to Manual Preference (MP) and Hemispheric Specialization (HS) for language, studied in a multimodal database dedicated to the investigation of HS and balanced for sex and MP (BIL&GIN). HG, located on the surface of the temporal lobe, hosts the primary auditory cortex. Previous studies have shown that HG volume is leftward asymmetrical and that the left HG (LHG) covaries with phonological performance and with the amount of cortex dedicated to the processing of the temporal aspects of sounds, suggesting a relationship between LHG and HSL. However, HG anatomy is highly variable and little known. In this thesis we have: 1- Described the HG inter-hemispheric gyrification pattern on the anatomical MRI images of 430 healthy participants. 2- Studied the variation of the first or anterior HG (aHG) surface area and its asymmetry, and shown its reduction in the presence of duplication and that its leftward asymmetry was present only in the case of a single LHG. Left-handers exhibited a lower incidence of right duplication and a loss of aHG leftward asymmetry. 3- Tested whether the variance of HG anatomy explained the interindividual variability of asymmetries measured with fMRI while listening to a list of words in 281 participants, and whether differences in HG anatomy with MP were related to decreased HS for language in left-handers. The HG inter-hemispheric gyrification pattern explained 11% of the variance of HG functional asymmetry, the patterns including a unique LHG being those with the strongest leftward asymmetry. There was no effect of MP on HG functional lateralization.
Elmi, Saïda. "An Advanced Skyline Approach for Imperfect Data Exploitation and Analysis". Thesis, Chasseneuil-du-Poitou, Ecole nationale supérieure de mécanique et d'aérotechnique, 2017. http://www.theses.fr/2017ESMA0011/document.
The main purpose of this thesis is to study an advanced database tool, the skyline operator, in the context of imperfect data modeled by evidence theory. In this thesis, we first address the fundamental question of how to extend the dominance relationship to evidential data, and then provide some optimization techniques for improving the efficiency of the evidential skyline. We then introduce an efficient approach for querying and processing the evidential skyline over multiple and distributed servers. In addition, we propose efficient methods to maintain the skyline results in the evidential database context when a set of objects is inserted or deleted. The idea is to incrementally compute the new skyline without re-running the initial computation from scratch. In the second step, we introduce the top-k skyline query over imperfect data and develop efficient algorithms for its computation. Furthermore, since the evidential skyline is often too large to be analyzed, we define the set SKY² to refine the evidential skyline and retrieve the best evidential skyline objects (or the stars). In addition, we develop suitable algorithms based on scalable techniques to efficiently compute the evidential SKY². Extensive experiments were conducted to show the efficiency and the effectiveness of our approaches.
Goibert, Morgane. "Statistical Understanding of Adversarial Robustness". Electronic Thesis or Diss., Institut polytechnique de Paris, 2023. http://www.theses.fr/2023IPPAT052.
This thesis focuses on the question of robustness in machine learning, specifically examining two types of attacks: poisoning attacks at training time and evasion attacks at inference time. The study of poisoning attacks dates back to the sixties and has been unified under the theory of robust statistics. However, prior research was primarily focused on classical data types, mainly real-numbered data, limiting the applicability of poisoning attack studies. In this thesis, robust statistics are extended to ranking data, which lack a vector space structure and have a combinatorial nature. The work presented in this thesis initiates the study of robustness in the context of ranking data and provides a framework for future extensions. Contributions include a practical algorithm to measure the robustness of statistics for the task of consensus ranking, and two robust statistics to solve this task. In contrast, since 2013, evasion attacks have gained significant attention in the deep learning field, particularly for image classification. Despite the proliferation of research works on adversarial examples, the theoretical analysis of the problem remains challenging and lacks unification. To address this matter, the thesis makes contributions to understanding and mitigating evasion attacks. These contributions involve the unification of adversarial examples' characteristics through the study of under-optimized edges and information flow within neural networks, and the establishment of theoretical bounds characterizing the success rate of modern low-dimensional attacks for a wide range of models.
Labernia, Fabien. "Algorithmes efficaces pour l’apprentissage de réseaux de préférences conditionnelles à partir de données bruitées". Thesis, Paris Sciences et Lettres (ComUE), 2018. http://www.theses.fr/2018PSLED018/document.
The rapid growth of personal web data has motivated the emergence of learning algorithms well suited to capturing users’ preferences. Among preference representation formalisms, conditional preference networks (CP-nets) have proven to be effective due to their compact and explainable structure. However, learning them is difficult due to their combinatorial nature. In this thesis, we tackle the problem of learning CP-nets from large corrupted datasets. Three new algorithms are introduced and studied on both synthetic and real datasets. The first algorithm is based on query learning and considers the contradictions between multiple users’ preferences by searching in a principled way for the variables that affect the preferences. The second algorithm relies on information-theoretic measures defined over the induced preference rules, which allow us to deal with corrupted data. An online version of this algorithm is also provided, which exploits McDiarmid's bound to define an asymptotically optimal decision criterion for selecting the best conditioned variable, hence making it possible to deal with possibly infinite data streams.
Sibony, Eric. "Analyse mustirésolution de données de classements". Thesis, Paris, ENST, 2016. http://www.theses.fr/2016ENST0036/document.
This thesis introduces a multiresolution analysis framework for ranking data. Initiated in the 18th century in the context of elections, the analysis of ranking data has attracted major interest in many fields of the scientific literature: psychometry, statistics, economics, operations research, machine learning and computational social choice, among others. It has been further revitalized by modern applications such as recommender systems, where the goal is to infer users’ preferences in order to make them the best personalized suggestions. In these settings, users express their preferences only on small and varying subsets of a large catalog of items. The analysis of such incomplete rankings, however, poses both a great statistical and a great computational challenge, leading industrial actors to use methods that only exploit a fraction of the available information. This thesis introduces a new representation for the data, which by construction overcomes the two aforementioned challenges. Though it relies on results from combinatorics and algebraic topology, it shares several analogies with multiresolution analysis, offering a natural and efficient framework for the analysis of incomplete rankings. As it does not involve any assumption on the data, it already leads to overperforming estimators in small-scale settings and can be combined with many regularization procedures for large-scale settings. For all those reasons, we believe that this multiresolution representation paves the way for a wide range of future developments and applications.
Jolivet, Laurence. "Modélisation des déplacements d'animaux dans un espace géographique : analyse et simulation". Thesis, Paris 1, 2014. http://www.theses.fr/2014PA010524/document.
Finding compromises between human development and wildlife protection is one concern of society. Taking animal movements into account in planning projects requires some knowledge of species' behaviours and of what determines their locations and their habitats. Our goal is to be able to represent animal movements in an accurate geographical space in order to simulate and evaluate the consequences of planning decisions. We first analysed how the features of the landscape influence movements, using locations collected on animals, for example GPS tracks (studies of ELIZ, ANSES, ONCFS, INRA), and data describing space such as BD TOPO®. The studied cases cover several types of environment and three species: red fox, roe deer and red deer. We found some results that confirm the role played by the spatial features, depending on the studied cases. For instance, in a periurban environment, foxes seem to be more often in wooded patches and in places with few human activities during some parts of the day (squares, areas with industrial or commercial activities, sides of railways). In a forested environment, deer are more likely to be influenced by slope and forest stands. Thanks to knowledge from data analyses and from the literature, we defined a simulation model for animal movements. We implemented it in the GeOxygene platform. The trajectories are built with an agent approach by taking into account the spatial behaviour of the species and the influence of elements that favour or hinder movements. We proposed a critical view of the modelling choices and some improvements based on the comparison with observations and experts' advice. Then, scenarios with infrastructures are defined so as to identify their impact and their efficiency.
Delporte, Julien. "Factorisation matricielle, application à la recommandation personnalisée de préférences". Phd thesis, INSA de Rouen, 2014. http://tel.archives-ouvertes.fr/tel-01005223.
Sadoun, Isma. "Raffinement progressif et personnalisé des requêtes de préférences dans un espace hautement dimensionnel". Versailles-St Quentin en Yvelines, 2014. http://www.theses.fr/2014VERS0004.
The use of preferences personalizes multi-criteria search and enhances the relevance of the result. The most prominent technique is the skyline query, based on the concept of Pareto dominance. These queries eliminate tuples dominated by other tuples. The user can then choose from the tuples that are not dominated, which can be considered the best choices. However, one of the main limitations of skyline queries is that when the number of dimensions increases, the result size becomes too large to offer any interesting insights. This thesis provides different solutions to this problem. The general idea is to extend the dominance relationships by introducing more flexible and individualized criteria for comparing tuples, then combine them gradually to best meet the needs of the user. Extensions were made to the skyline operator to offer the user the ability to classify tuples in order to choose the best solution or select the k best solutions. The user can successively use several preference relations, ordering them to take into account the priorities and level of reliability he attributes to each. This thesis also describes the proposed algorithms, along with the experiments to validate our approaches.
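The Pareto dominance test and the skyline described in the abstract above can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the "lower is better on every dimension" convention, and the hotel data (price, distance to the beach) are assumptions made for illustration, and the quadratic scan is the naive method rather than any algorithm the author proposes.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b.

    Assumption: lower values are better on every dimension. a dominates b
    when a is at least as good everywhere and strictly better somewhere.
    """
    at_least_as_good = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def skyline(tuples: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Naive O(n^2) skyline: keep the tuples that no other tuple dominates."""
    return [t for t in tuples if not any(dominates(o, t) for o in tuples)]


# Hypothetical data: (price, distance to the beach), both to be minimized.
hotels = [(50, 8), (60, 2), (80, 1), (70, 5), (90, 9)]
print(skyline(hotels))
```

Here (70, 5) is dominated by (60, 2) and (90, 9) by (50, 8), so three tuples remain. Adding dimensions makes it harder for any tuple to dominate another, which is why the skyline grows in high-dimensional spaces.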
Sibony, Eric. "Analyse mustirésolution de données de classements". Electronic Thesis or Diss., Paris, ENST, 2016. http://www.theses.fr/2016ENST0036.
Bouker, Slim. "Contribution à l'extraction des règles d'association basée sur des préférences". Thesis, Clermont-Ferrand 2, 2015. http://www.theses.fr/2015CLF22585/document.
Tran, Nguyen Minh-Thu. "Abstraction et règles d'association pour l'amélioration des systèmes de recommandation à partir de données de préférences binaires". Paris 13, 2011. http://www.theses.fr/2011PA132016.
In recent years, recommendation systems have been extensively explored in order to help users cope with the increasing amount of information on the Internet. Those systems are used in e-commerce (Amazon, eBay, Netflix...), entertainment, online news, etc. In the domain of e-commerce, the available data is often difficult to exploit to build robust recommendations: binary data, a long tail in the distribution of preferences, and the continual addition or removal of items. In fact, most recommender systems focus on the most popular items because new items or those of the "long tail" are associated with little or no preference. To improve the performance of these systems, we propose to search for association rules between abstracted items. First, the abstraction of the items can lead to a considerable reduction of the long tail effect. Second, the extraction of abstract association rules can be used to identify items to be recommended. Two algorithms are introduced: AbsTopk, based on rules in the abstract space, and ACReco, which combines items from the abstract and concrete spaces in pairs. These algorithms were evaluated quantitatively (relevance) and qualitatively (novelty and diversity) on a real database from an e-commerce site. The empirical results presented show the interest of the proposed approach.
Hebert, Pierre-Alexandre. "Analyse de données sensorielles : une approche ordinale floue". Compiègne, 2004. http://www.theses.fr/2004COMP1542.
Sensory profile data aim at describing the sensory perceptions of human subjects. Such data are composed of scores attributed by human sensory experts (or judges) in order to describe a set of products according to sensory descriptors. All assessments are repeated, usually three times. The thesis describes a new analysis method based on a fuzzy modelling of the scores. The first step of the method consists in extracting and encoding the relevant information of each replicate into a fuzzy weak dominance relation. Then an aggregation procedure over the replicates makes it possible to synthesize the perception of each judge into a new fuzzy relation. In a similar way, a consensual relation is finally obtained for each descriptor by fusing the relations of the judges. To ensure that the fused relations can be interpreted, fuzzy preference theory is used. A set of graphical tools is then proposed for the mono- and multidimensional analysis of the obtained relations.
Mukhtar, Hamid. "Intergiciel pour la composition des tâches utilisateurs dans les environnements pervasifs étant donné les préférences utilisateurs". Phd thesis, Institut National des Télécommunications, 2009. http://tel.archives-ouvertes.fr/tel-00537308.
Diallo, Mouhamadou Saliou. "Découverte de règles de préférences contextuelles : application à la construction de profils utilisateurs". Thesis, Tours, 2015. http://www.theses.fr/2015TOUR4052/document.
The use of preferences is attracting growing interest as a way to personalize responses to requests and to make targeted recommendations. Nevertheless, the manual construction of preference profiles remains complex and time-consuming. In this context, we present in this thesis a new automatic method for preference elicitation based on data mining techniques. Our proposal is a two-phase algorithm: (1) extracting all contextual preference rules from a set of user preferences and (2) building the user profile. At the end of the first phase, we notice that there are too many preference rules that satisfy the fixed constraints, so in the second phase we eliminate the superfluous preference rules. In our approach, a user profile is constituted by the set of contextual preference rules resulting from the second phase. A user profile must satisfy conciseness and soundness properties. The soundness property guarantees that the preference rules specifying the profiles are in agreement with a large set of the user preferences and contradict a small number of them. On the other hand, conciseness implies that profiles are small sets of preference rules. We also propose four prediction methods which use the extracted profiles. We validated our approach on a set of real-world movie rating datasets built from MovieLens and IMDB. The whole movie rating database consists of 800,156 votes from 6,040 users about 3,881 movies. The results of these experiments demonstrate that the conciseness of user profiles is controlled by the minimal agreement threshold and that, even with strong reduction, the soundness of the profile remains at an acceptable level. These experiments also show that the predictive quality of some of our ranking strategies outperforms SVMRank in several situations.
El, Moussawi Adnan. "Clustering exploratoire pour la segmentation de données clients". Thesis, Tours, 2018. http://www.theses.fr/2018TOUR4010/document.
The research work presented in this thesis focuses on the exploration of the multiplicity of clustering solutions. The goal is to provide marketing experts with an interactive tool for exploring customer data that considers expert preferences on the space of attributes. We first give the definition of an exploratory clustering system. Then, we propose a new semi-supervised clustering method that considers the user’s quantitative preferences on the analysis attributes and manages the sensitivity to these preferences. Our method takes advantage of metric learning to find a compromise solution that is both well adapted to the data structure and consistent with the expert’s preferences. Finally, we propose a prototype of exploratory clustering for customer relationship data segmentation that integrates the proposed method. The prototype also integrates visual and interaction components essential for the implementation of the exploratory clustering process.
Alili, Hiba. "Intégration de données basée sur la qualité pour l'enrichissement des sources de données locales dans le Service Lake". Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLED019.
In the Big Data era, companies are moving away from traditional data-warehouse solutions whereby expensive and time-consuming ETL (Extract, Transform, Load) processes are used, towards data lakes in order to manage their increasingly growing data. Yet the stored knowledge in companies’ databases, even though in the constructed data lakes, can never be complete and up-to-date, because of the continuous production of data. Local data sources often need to be augmented and enriched with information coming from external data sources. Unfortunately, the data enrichment process is one of the manual labors undertaken by experts who enrich data by adding information based on their expertise or select relevant data sources to complete missing information. Such work can be tedious, expensive and time-consuming, making it very promising for automation. We present in this work an active user-centric data integration approach to automatically enrich local data sources, in which the missing information is leveraged on the fly from web sources using data services. Accordingly, our approach enables users to query for information about concepts that are not defined in the data source schema. In doing so, we take into consideration a set of user preferences such as the cost threshold and the response time necessary to compute the desired answers, while ensuring a good quality of the obtained results.
Mouloudi, Hassina. "Personnalisation de requêtes et visualisations OLAP sous contraintes". Tours, 2007. http://www.theses.fr/2007TOUR4029.
Personalization is extensively used in information retrieval and databases. It helps the user cope with the diversity and volume of the information he accesses. A data warehouse stores large volumes of consolidated and historized multidimensional data to be analyzed. The data warehouse is in particular designed to support complex decision queries (OLAP queries) whose results are displayed in the form of cross tables. These results can be very large, and often they cannot be visualized entirely on the display device (PDA, mobile phone, etc.). This work aims to study the personalization of information for a user querying a data warehouse with OLAP queries. A state of the art of works on personalization in relational databases allows us to establish their principal characteristics and adapt them to the context of exploitation of data warehouses by OLAP queries. We first propose a formalization of the concept of visualizations of OLAP query results, and we show how visualizations can be built and manipulated. Then, we propose a method for personalizing visualizations based on a user profile (including preferences and constraints). Our method corresponds to the formal definition of a personalization operator added to the query language for visualizations. This operator can be implemented by transformation of a query or by transformation of the query result. We propose an implementation of this operator, which is used as a basis for a prototype allowing a user to obtain his preferred visualization when querying the data warehouse via a mobile device. This prototype allows us to validate our approach and to check its effectiveness.
Boubou, Mounzer. "Contribution aux méthodes de classification non supervisée via des approches prétopologiques et d'agrégation d'opinions". Phd thesis, Université Claude Bernard - Lyon I, 2007. http://tel.archives-ouvertes.fr/tel-00195779.
Mokhtari, Amine. "Système personnalisé de planification d'itinéraire unimodal : une approche basée sur la théorie des ensembles flous". Rennes 1, 2011. http://www.theses.fr/2011REN1E004.
Brancotte, Bryan. "Agrégation de classements avec égalités : algorithmes, guides à l'utilisateur et applications aux données biologiques". Thesis, Paris 11, 2015. http://www.theses.fr/2015PA112184/document.
Pełny tekst źródłaThe rank aggregation problem is to build consensus among a set of rankings (ordered elements). Although this problem has numerous applications (consensus among user votes, consensus between results ordered differently by different search engines ...), computing an optimal consensus is rarely feasible in cases of real applications (problem NP-Hard). Many approximation algorithms and heuristics were therefore designed. However, their performance (time and quality of product loss) are quite different and depend on the datasets to be aggregated. Several studies have compared these algorithms but they have generally not considered the case (yet common in real datasets) that elements can be tied in rankings (elements at the same rank). Choosing a consensus algorithm for a given dataset is therefore a particularly important issue to be studied (many applications) and it is an open problem in the sense that none of the existing studies address it. More formally, a consensus ranking is a ranking that minimizes the sum of the distances between this consensus and the input rankings. Like much of the state-of-art, we have considered in our studies the generalized Kendall-Tau distance, and variants. Specifically, this thesis has three contributions. First, we propose new complexity results associated with cases encountered in the actual data that rankings may be incomplete and where multiple items can be classified equally (ties). We isolate the different "features" that can explain variations in the results produced by the aggregation algorithms (for example, using the generalized distance of Kendall-Tau or variants, pre-processing the datasets with unification or projection). We propose a guide to characterize the context and the need of a user to guide him into the choice of both a pre-treatment of its datasets but also the distance to choose to calculate the consensus. We finally adapt existing algorithms to this new context. 
Second, we evaluate these algorithms on a large and varied set of datasets, both real and synthetic, that reproduce real features such as similarity between rankings, the presence of ties, and different pre-processing steps. This large evaluation comes with a new method for generating synthetic data with similarities, based on Markov chain modeling. The evaluation led to the isolation of dataset features that affect the performance of the aggregation algorithms, and to the design of a guide that characterizes the needs of a user and advises on the choice of the algorithm to use. A web platform to replicate and extend these analyses is available (rank-aggregation-with-ties.lri.fr). Finally, we demonstrate the value of the rank aggregation approach in two use cases. We provide a tool that reformulates users' textual queries using biomedical terminologies, queries biological databases, and ultimately produces a consensus of the results obtained for each reformulation (conqur-bio.lri.fr). We compare the results to the reference platform and show a clear improvement in result quality. We also compute a consensus between lists of workflows established by experts in the context of similarity between scientific workflows. We note that the computed consensus agrees with the experts in a very large majority of cases.
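The definition of a consensus used in the abstract above (a ranking minimizing the sum of distances to the input rankings) can be made concrete with a small sketch. It assumes complete rankings over the same elements, represented as lists of tied buckets, and charges a configurable penalty for a pair that is tied in one ranking only; the default of 1 is an assumption, not necessarily the setting used in the thesis. All function names are illustrative.

```python
from itertools import combinations


def positions(ranking):
    """Map each element to the index of its bucket (tied elements share an index)."""
    return {elem: i for i, bucket in enumerate(ranking) for elem in bucket}


def generalized_kendall_tau(r1, r2, tie_penalty=1.0):
    """Pairwise disagreement between two rankings with ties.

    Assumes both rankings contain exactly the same elements.
    A pair costs 1 if its order is inverted, tie_penalty if it is
    tied in exactly one of the two rankings, and 0 otherwise.
    """
    p1, p2 = positions(r1), positions(r2)
    dist = 0.0
    for a, b in combinations(sorted(p1), 2):
        d1 = p1[a] - p1[b]
        d2 = p2[a] - p2[b]
        if d1 * d2 < 0:                   # strictly opposite orders
            dist += 1.0
        elif (d1 == 0) != (d2 == 0):      # tied in one ranking only
            dist += tie_penalty
    return dist


def consensus_score(candidate, rankings, tie_penalty=1.0):
    """Sum of distances from a candidate consensus to all input rankings."""
    return sum(generalized_kendall_tau(candidate, r, tie_penalty) for r in rankings)


# Three toy rankings over A, B, C; each inner list is a bucket of tied elements.
r1 = [["A"], ["B"], ["C"]]
r2 = [["A"], ["B", "C"]]
r3 = [["B"], ["A"], ["C"]]

print(generalized_kendall_tau(r1, r2))    # B and C are tied only in r2
print(consensus_score(r1, [r1, r2, r3]))  # score of r1 as a candidate consensus
```

An exact consensus would minimize `consensus_score` over all candidate rankings, which is what becomes infeasible on real datasets and motivates the heuristics the abstract mentions.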
Thuilier, Juliette. "Contraintes préférentielles et ordre des mots en français". Phd thesis, Université Paris-Diderot - Paris VII, 2012. http://tel.archives-ouvertes.fr/tel-00781228.
Amdouni, Soumaya. "Composition de services web dans des environnements incertains". Thesis, Lyon 1, 2015. http://www.theses.fr/2015LYO10128.
In this thesis we focus on the data web services composition problem and study the impact of the uncertainty that may be associated with the output of a service on the service selection and composition processes. This work is motivated by the increasing number of application domains where data web services may return uncertain data, including e-commerce, scientific data exploration, open web data, etc. We call services that return uncertain data uncertain services. In this dissertation, we propose new models and techniques for the selection and the composition of uncertain data web services. Our techniques are based on well-established fuzzy and probabilistic database theories and can handle the uncertainty efficiently. First, we proposed a composition model that takes into account user preferences. In our model, user preferences are modelled as fuzzy constraints, and services are described with fuzzy constraints to better characterize the data they access. The composition model also features a composition algebra that allows us to rank the returned results based on their relevance to the user's preferences. Second, we proposed a probabilistic approach to model the uncertainty of the data returned by uncertain data services. Specifically, we extended the web service description standards (e.g., WSDL) to represent the outputs' probabilities. We also extended the service invocation process to take into account the uncertainty of input data. This extension is based on the possible worlds theory used in probabilistic databases. We also proposed a set of probability-aware composition operators that are necessary to orchestrate uncertain data services. Since a composition may accept multiple orchestration plans and not all of them compute the correct probabilities of outputs, we defined a set of conditions to check whether a plan is safe (i.e., computes the probabilities correctly) or not.
We implemented our different techniques and applied them to the real-estate and e-commerce domains. We provide a performance study of our different composition techniques.
Gras, Benjamin. "Les oubliés de la recommandation sociale". Thesis, Université de Lorraine, 2018. http://www.theses.fr/2018LORR0017/document.
A recommender system aims at providing relevant resources to a user, named the active user. To allow this recommendation, the system exploits the information it has collected about the active user or about resources. Collaborative filtering (CF) is a widely used recommendation approach. The data exploited by CF are the preferences expressed by users on resources. CF is based on the assumption that preferences are consistent between users, allowing a user's preferences to be inferred from the preferences of other users. In a CF-based recommender system, at least one user community has to share the preferences of the active user to provide him with high-quality recommendations. Let us define a specific preference as a preference that is not shared by any group of users. A user with several specific preferences will likely be poorly served by a classic CF approach. This is the problem of Grey Sheep Users (GSU). In this thesis, I focus on three separate questions. 1) What is a specific preference? I give an answer by proposing associated hypotheses that I validate experimentally. 2) How to identify GSU in preference data? This identification is important to anticipate the low-quality recommendations that will be provided to these users. I propose numerical indicators to identify GSU in a social recommendation dataset. These indicators outperform those of the state of the art and make it possible to isolate users whose quality of recommendations is very low. 3) How can I model GSU to improve the quality of the recommendations they receive? I propose new recommendation approaches to allow GSU to benefit from the opinions of other users.
Pralet, Cédric. "Un cadre algébrique général pour représenter et résoudre des problèmes de décision séquentielle avec incertitudes, faisabilités et utilités". Toulouse, ENSAE, 2006. http://www.theses.fr/2006ESAE0013.
Abidi, Amna. "Imperfect RDF Databases : From Modelling to Querying". Thesis, Chasseneuil-du-Poitou, Ecole nationale supérieure de mécanique et d'aérotechnique, 2019. http://www.theses.fr/2019ESMA0008/document.
The ever-increasing interest in RDF data on the Web has led to several important research efforts to enrich the traditional RDF data formalism for exploitation and analysis purposes. The work of this thesis continues those efforts by addressing the issue of RDF data management in the presence of imperfection (untruthfulness, uncertainty, etc.). The main contributions of this dissertation are as follows. (1) We tackled the trusted RDF data model. Hence, we proposed to extend skyline queries to trust RDF data, which consists in extracting the most interesting trusted resources according to user-defined criteria. (2) We studied via statistical methods the impact of the trust measure on the Trust-skyline set. (3) We integrated into the structure of RDF data (i.e., the subject-property-object triple) a fourth element expressing a possibility measure to reflect the user's opinion about the truth of a statement. To deal with possibility requirements, an appropriate language framework is introduced, namely Pi-SPARQL, which extends SPARQL into a possibility-aware query language. Finally, we studied a new skyline operator variant to extract possibilistic RDF resources that are possibly dominated by no other resources in the sense of Pareto optimality.
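For readers unfamiliar with the skyline operator mentioned in the abstract above, the following is a minimal sketch of the classical skyline under Pareto dominance. It is not the Trust-skyline or the possibilistic variant studied in the thesis; it assumes plain numeric tuples where smaller values are preferred on every dimension, and the sample data and function names are hypothetical.

```python
def dominates(a, b):
    """True if a Pareto-dominates b: at least as good on every dimension
    and strictly better on at least one (smaller is better, by assumption)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Return the points that no other point dominates (naive quadratic scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical resources described by (price, distance).
resources = [(50, 8), (70, 3), (60, 9), (90, 2), (80, 4)]
print(skyline(resources))
```

Here (60, 9) is dominated by (50, 8) and (80, 4) by (70, 3), so only the three remaining tuples are kept. The variants described in the thesis additionally weigh trust or possibility degrees when deciding dominance.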
Ben, Messaoud Rim. "Towards efficient mobile crowdsensing assignment and uploading schemes". Thesis, Paris Est, 2017. http://www.theses.fr/2017PESC1031/document.
The ubiquity of sensor-equipped mobile devices has enabled people to contribute data via crowdsensing systems. This emergent paradigm comes with various applications. However, new challenges arise given users' involvement in the data collection process. In this context, we introduce collaborative sensing schemes which tackle four main questions: How to assign sensing tasks to maximize data quality with energy-awareness? How to minimize the processing time of sensing tasks? How to motivate users to dedicate part of their resources to the crowdsensing process? And how to protect participants' privacy without impacting data utility when reporting collected sensory data? First, we focus on the fact that smart devices are energy-constrained and develop task assignment methods that aim to maximize sensor data quality while minimizing the overall energy consumption of the data harvesting process. The resulting contribution, materialized as a Quality and Energy-aware Mobile Sensing Scheme (QEMSS), first defines data quality metrics, then models and solves the corresponding optimization problem using a Tabu-Search based heuristic. Moreover, we assess the fairness of the resulting scheduling by introducing the F-QEMSS variant. Through extensive simulations, we show that both solutions achieve competitive data quality levels when compared to competing methods, especially in situations where the process faces low-density sensing areas and resource shortages. As a second contribution, we propose to distribute the assignment process among participants to minimize the average sensing time and processing overload compared to a fully centralized approach. Thus, we suggest designating some participants to carry extra sensing tasks and delegate them to appropriate neighbors. The new assignment is based on predicting users' local mobility and sensing preferences.
Accordingly, we develop two new greedy-based assignment schemes, one only Mobility-aware (MATA) and the other accounting for both preferences and mobility (P-MATA), and evaluate their performance. Both MATA and P-MATA consider a voluntary sensing process and show that accounting for users' preferences minimizes the sensing time. Having shown that, our third contribution in this thesis is conceived as an Incentives-based variant, IP-MATA+. IP-MATA+ incorporates rewards in the users' choice model and demonstrates their positive impact on users' commitment, especially when the dedicated budget is shared as a function of contributed data quality. Finally, our fourth and last contribution addresses users' privacy concerns within crowdsensing systems. More specifically, we study the minimization of the privacy leakage incurred in the data uploading phase while accounting for the possible quality regression. That is, we assess simultaneously the two competing goals of ensuring the data utility required by queriers and protecting participants' sensitive information. Thus, we introduce a trusted entity into the traditional crowdsensing system. This entity runs a general privacy-preserving mechanism to release a distorted version of sensed data that responds to a privacy-utility trade-off. The proposed mechanism, called PRUM, is evaluated on three sensing datasets, different adversary models and two main data uploading scenarios. Results show that a limited distortion on collected data may ensure privacy while maintaining about 98% of the required utility level. The four contributions of this thesis tackle competing issues in crowdsensing, which paves the way for its real implementation and broader deployment.
Marsaudon, Antoine. "Impact of health shocks on personality traits, economic preferences, and risky behaviors". Thesis, Paris 1, 2019. http://www.theses.fr/2019PA01E013.
This PhD dissertation aims to document whether personality traits and economic preferences are stable parameters after the occurrence of a significant health event. Given the massive impacts of traits and preferences on life outcomes, it is necessary to provide information as to how much these can change. Results show that traits are slightly modified when individuals face a health event (Chapter 1). Economic preferences, however, do not change after the occurrence of such events (Chapter 2). The finding that preferences are stable might suggest a genetic transmission of these parameters. However, results show that economic preferences are not determined in utero (Chapter 3). Additionally, individuals facing health events are more likely to adopt healthier behaviors than those who do not face such events (Chapter 4). These findings can be used by economic researchers and policymakers. For the former, relying solely upon individual fixed-effect estimations or first-difference methods might not account for trait variation. For the latter, changes in traits might modify the willingness to invest in various health, education and labor outcomes, subsequently influencing macroeconomic performance.
Tapucu, Dilek. "Un modèle générique pour la capture de préférences dans les bases de données à base ontologique". Phd thesis, 2010. http://tel.archives-ouvertes.fr/tel-00518476.