Dissertations / Theses on the topic 'Stockage de données dans l’ADN'
Consult the top 39 dissertations / theses for your research on the topic 'Stockage de données dans l’ADN.'
Dimopoulou, Melpomeni. "Techniques de codage pour le stockage à long terme d’images numériques dans l’ADN synthétique." Thesis, Université Côte d'Azur, 2020. http://www.theses.fr/2020COAZ4073.
Data explosion is one of the greatest challenges of digital evolution, causing storage demand to grow at a rate that the actual capabilities of devices cannot match. The digital universe is forecast to grow to over 175 zettabytes by 2025, while 80% of it is infrequently accessed ("cold" data), yet safely archived on off-line tape drives for security and regulatory compliance reasons. At the same time, conventional storage devices have a limited lifespan of 10 to 20 years and must therefore be frequently replaced to ensure data reliability, a process which is expensive both in terms of money and energy. Recent studies have shown that, thanks to its biological properties, DNA is a very promising candidate for the long-term archiving of "cold" digital data for centuries or even longer, under the condition that the information is encoded in a quaternary stream made up of the symbols A, T, C and G, representing the 4 components of the DNA molecule, while also respecting some important encoding constraints. Pioneering works have proposed different algorithms for DNA coding, leaving room for further improvement. In this thesis we present novel image coding techniques for the efficient storage of digital images in DNA. We implemented a novel fixed-length algorithm for the construction of a robust quaternary code that respects the biological constraints, and proposed two different mapping functions to allow flexibility according to the encoding needs. Furthermore, since one of the main challenges of DNA data storage is the expensive cost of DNA synthesis, we make a very first attempt to introduce controlled compression into the proposed encoding workflow. The proposed codec is competitive compared to the state of the art. Furthermore, our end-to-end coding/decoding solution has been tested in a wet-lab experiment to prove the feasibility of the theoretical study in practice.
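To make the encoding constraints concrete, here is a minimal, purely illustrative sketch of a constrained binary-to-DNA mapping (it is not the fixed-length codec proposed in the thesis): each byte is expanded into base-3 digits, and each digit selects one of the three nucleotides that differ from the previously emitted symbol, so that homopolymers (runs such as "AAA") cannot appear. All function names are assumptions made for the example.

```python
NUCLEOTIDES = "ACGT"

def byte_to_trits(value, width=6):
    """Expand one byte (0..255) into a fixed number of base-3 digits; 3**6 = 729 >= 256."""
    trits = []
    for _ in range(width):
        trits.append(value % 3)
        value //= 3
    return list(reversed(trits))

def encode(data, start="A"):
    """Map bytes to a homopolymer-free quaternary (A/C/G/T) stream."""
    strand, previous = [], start
    for byte in data:
        for trit in byte_to_trits(byte):
            # the three nucleotides different from the previous one, in a fixed order
            choices = [n for n in NUCLEOTIDES if n != previous]
            previous = choices[trit]
            strand.append(previous)
    return "".join(strand)

def decode(strand, start="A"):
    """Invert encode(): recover the trits from each symbol choice, then rebuild the bytes."""
    trits, previous = [], start
    for symbol in strand:
        choices = [n for n in NUCLEOTIDES if n != previous]
        trits.append(choices.index(symbol))
        previous = symbol
    data = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for trit in trits[i:i + 6]:
            value = value * 3 + trit
        data.append(value)
    return bytes(data)

if __name__ == "__main__":
    strand = encode(b"DNA")
    assert decode(strand) == b"DNA"
    print(strand)   # 18 symbols, no two adjacent symbols identical
```

Real DNA codecs additionally bound GC content and add error correction for synthesis and sequencing noise, which this sketch deliberately omits.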
Berton, Chloé. "Sécurité des données stockées sur molécules d’ADN." Electronic Thesis or Diss., Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2024. http://www.theses.fr/2024IMTA0431.
The volume of digital data produced worldwide every year is increasing exponentially, and current storage solutions are reaching their limits. In this context, data storage on DNA molecules holds great promise. Storing up to 10¹⁸ bytes per gram of DNA for almost no energy consumption, it has a lifespan 100 times longer than hard disks. As this storage technology is still under development, the opportunity presents itself to natively integrate data security mechanisms, which is the aim of this thesis. Our first contribution is a risk analysis of the entire storage chain, which enabled us to identify vulnerabilities in digital and biological processes, particularly in terms of confidentiality, integrity, availability and traceability. A second contribution is the identification of elementary biological operators for simple manipulations of DNA. Using these operators, we developed a DNACipher encryption solution that requires biomolecular decryption (DNADecipher) of the molecules before the data can be read correctly. This third contribution, based on enzymes, required the development of a coding algorithm for digital data into DNA sequences, a contribution called DSWE. This algorithm respects the constraints of biological processes (e.g. homopolymers) and of our encryption solution. Our final contribution is an experimental validation of our secure storage chain: the first proof of concept showing that it is possible to secure this new storage medium using biomolecular manipulations.
Bouabache, Fatiha. "Stockage fiable des données dans les grilles, application au stockage des images de checkpoint." Paris 11, 2010. http://www.theses.fr/2010PA112329.
Rollback/recovery solutions rely on the reliability of checkpoint storage (after a failure, if the checkpoint images are not available, the rollback operation fails). The goal of this thesis is to propose a reliable and efficient checkpoint storage service. By reliable, we mean that whatever the failure scenario is, as long as it respects the assumptions made by the algorithms, the checkpoint images remain available. By efficient, we mean minimizing the time required to transfer and store the checkpoint images, which in turn minimizes the global execution time of the checkpoint waves. To ensure these two points (reliability and efficiency), we propose: 1. A new coordinated checkpoint protocol which tolerates checkpoint server failures and cluster failures, and ensures checkpoint storage reliability in a grid environment; 2. A distributed storage service structured as a three-layer architecture: (a) the replication layer: to ensure checkpoint storage reliability, we propose to replicate the images over the network; in this direction, we propose two hierarchical replication strategies adapted to the considered architecture that exploit the locality of checkpoint images in order to minimize inter-cluster communication; (b) the scheduling layer: at this level we work on storage efficiency by reducing the data transfer time, and we propose an algorithm based on uniform random sampling of possible schedules; (c) the scheduling engine: at this layer, we develop a tool that implements the scheduling plan computed in the scheduling layer.
Obame, Meye Pierre. "Sûreté de fonctionnement dans le nuage de stockage." Thesis, Rennes 1, 2016. http://www.theses.fr/2016REN1S091/document.
The quantity of data in the world is steadily increasing, bringing challenges to storage system providers to find ways to handle data efficiently in terms of dependability and in a cost-effective manner. We have been interested in cloud storage, a growing trend in data storage solutions. For instance, the International Data Corporation (IDC) predicts that by 2020, nearly 40% of the data in the world will be stored or processed in a cloud. This thesis addresses challenges around data access latency and dependability in cloud storage. We proposed Mistore, a distributed storage system that we designed to ensure data availability, durability, and low access latency by leveraging the Digital Subscriber Line (xDSL) infrastructure of an Internet Service Provider (ISP). Mistore uses the available storage resources of a large number of home gateways and Points of Presence for content storage and caching facilities. Mistore also targets data consistency by providing multiple types of consistency criteria on content and a versioning system. We also considered data security and confidentiality in the context of storage systems applying data deduplication, which is becoming one of the most popular techniques for reducing storage cost, and we designed a two-phase data deduplication scheme that is secure against malicious clients while remaining efficient in terms of network bandwidth and storage space savings.
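As a purely illustrative companion to the deduplication part of this abstract, the sketch below shows the basic fingerprint-then-upload exchange behind content-addressed deduplication; it is not the secure two-phase protocol designed in the thesis, which additionally protects against malicious clients. The class and function names are invented for the example.

```python
import hashlib

class DedupStore:
    """Server-side content-addressed store: one physical copy per distinct block."""
    def __init__(self):
        self._blocks = {}                      # fingerprint -> block bytes

    def __len__(self):
        return len(self._blocks)

    def has(self, fingerprint):
        """Phase 1: the client asks whether this fingerprint is already stored."""
        return fingerprint in self._blocks

    def put(self, block):
        """Phase 2: upload only blocks the store does not already hold."""
        fingerprint = hashlib.sha256(block).hexdigest()
        self._blocks.setdefault(fingerprint, block)
        return fingerprint

def client_upload(store, block):
    fingerprint = hashlib.sha256(block).hexdigest()
    if store.has(fingerprint):
        return fingerprint                     # no transfer: bandwidth and space are saved
    return store.put(block)

if __name__ == "__main__":
    store = DedupStore()
    first = client_upload(store, b"same payload")
    second = client_upload(store, b"same payload")   # deduplicated on the second upload
    assert first == second and len(store) == 1
```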
Secret, Ghislain. "La maintenance des données dans les systèmes de stockage pair à pair." Amiens, 2009. http://www.theses.fr/2009AMIE0111.
Peer-to-peer systems are designed to share resources on the Internet. The independence of the architecture from any centralized server gives peer-to-peer networks very high fault tolerance (no single peer is essential to the functioning of the network). This property makes the architecture very suitable for permanent storage of data on a large scale. However, peer-to-peer systems are characterized by peer volatility: peers connect and disconnect randomly. The challenge is to ensure the continuity of data on a constantly changing storage medium. To cope with peer volatility, data redundancy schemes coupled with mechanisms for reconstructing lost data are introduced. But the reconstructions needed to maintain data continuity are not neutral in terms of the burden they place on the system. To investigate the factors that drive the data maintenance cost up, a model of a peer-to-peer storage system was designed. This model is based on an IDA (Information Dispersal Algorithm) redundancy scheme. Built on this model, a simulator was developed and the behaviour of the system with respect to the cost of data regeneration was analyzed. Two reconstruction strategies are examined. The first mechanism is based on a threshold on the level of data redundancy; it requires constant monitoring of the data state. The second strategy allocates a quota of reconstructions for a defined period of time; it is less reassuring because it significantly reduces control over the data state by abstracting away the threshold mechanism. Based on a stochastic analysis of the strategies, guidelines are provided for setting the system parameters according to the desired durability target.
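The threshold-based strategy can be illustrated with a small sketch under simplified assumptions (this is not the simulator built in the thesis): a document is dispersed into n fragments of which any k suffice to rebuild it, IDA-style, and a repair is triggered as soon as the number of live fragments falls to a chosen threshold above k. The names and figures are illustrative.

```python
def needs_repair(live_fragments, k, threshold):
    """Trigger reconstruction when redundancy drops to the threshold (threshold >= k)."""
    return live_fragments <= threshold

def repair(live_fragments, k, n):
    """Rebuild missing fragments from any k survivors and redisperse up to n."""
    if live_fragments < k:
        raise RuntimeError("data lost: fewer than k fragments remain")
    return n    # redundancy restored to its nominal level

if __name__ == "__main__":
    n, k, threshold = 16, 8, 10
    live = n
    for failures in [2, 3, 2]:           # fragments lost as peers disconnect
        live -= failures
        if needs_repair(live, k, threshold):
            live = repair(live, k, n)
    print(live)   # 16: repairs kept the redundancy from ever dropping below k
```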
Soyez, Olivier. "Stockage dans les systèmes pair à pair." Phd thesis, Université de Picardie Jules Verne, 2005. http://tel.archives-ouvertes.fr/tel-00011443.
As a first step, we built a prototype, Us, and designed a file-system-like user interface named UsFS. A data journaling mechanism is included in UsFS. We then studied data distributions within the Us network, whose goal is to minimize the disturbance caused to each peer by the reconstruction process. Finally, we extended our distribution scheme to handle the dynamic behaviour of peers and to take failure correlations into account.
Fournié, Laurent Henri. "Stockage et manipulation transactionnels dans une base de données déductives à objets : techniques et performances." Versailles-St Quentin en Yvelines, 1998. http://www.theses.fr/1998VERS0017.
Romito, Benoit. "Stockage décentralisé adaptatif : autonomie et mobilité des données dans les réseaux pair-à-pair." Caen, 2012. http://www.theses.fr/2012CAEN2072.
Le, Hung-Cuong. "Optimisation d'accès au médium et stockage de données distribuées dans les réseaux de capteurs." Besançon, 2008. http://www.theses.fr/2008BESA2052.
Wireless sensor networks have been a very active research topic in recent years. This technology can be applied in different domains such as the environment, industry, commerce, medicine, and the military. Depending on the application type, the problems and requirements may differ. In this thesis, we are interested in two major problems: medium access control and distributed data storage. The document is divided into two parts: the first part is a state of the art of existing works, and the second part describes our contribution. In the first contribution, we propose two MAC protocols. The first one optimizes the lifetime of wireless sensor networks for surveillance applications, and the second one reduces transmission latency in event-driven wireless sensor networks for critical applications. In the second contribution, we work with several data storage models in wireless sensor networks, focusing on the data-centric storage model. We propose a clustering structure for sensors that improves routing and reduces the number of transmissions in order to prolong the network lifetime.
Borba, Ribeiro Heverson. "L'Exploitation de Codes Fontaines pour un Stockage Persistant des Données dans les Réseaux d'Overlay Structurés." Phd thesis, Université Rennes 1, 2012. http://tel.archives-ouvertes.fr/tel-00763284.
Carpen-Amarie, Alexandra. "Utilisation de BlobSeer pour le stockage de données dans les Clouds: auto-adaptation, intégration, évaluation." Phd thesis, École normale supérieure de Cachan - ENS Cachan, 2011. http://tel.archives-ouvertes.fr/tel-00696012.
Dandoush, Abdulhalim. "L'Analyse et l'Optimisation des Systèmes de Stockage de Données dans les Réseaux Pair-à-Pair." Phd thesis, Université de Nice Sophia-Antipolis, 2010. http://tel.archives-ouvertes.fr/tel-00470493.
Traboulsi, Salam. "Virtualisation du stockage dans les grilles informatiques : administration et monitoring." Toulouse 3, 2008. http://thesesups.ups-tlse.fr/385/.
Virtualization in grid environments is a recent way to improve platform usage. ViSaGe is a middleware designed to provide the set of functionalities needed for storage virtualization: transparent, reliable remote access to data and distributed data management. ViSaGe aggregates distributed physical storage resources. However, ensuring the performance of data access in a grid environment is a major issue, as large amounts of data are stored and constantly accessed and are directly involved in task execution time. In particular, the placement and selection of replicated data are made difficult by the dynamic nature of grid environments, namely variations in the workload of grid nodes. These workload variations reflect the state of the system resources (CPU, disks and networks) and are mainly perceived by a monitoring system. Several monitoring systems exist in the literature; they monitor system resource consumption and applications, but none of them offers all the characteristics ViSaGe requires. ViSaGe needs a system that analyzes node workload at runtime to improve data storage management. Therefore, the ViSaGe administration and monitoring service, named Admon, is proposed. We show that Admon efficiently and dynamically places data according to resource usage, ensuring the best performance while limiting the monitoring overhead.
Jaiman, Vikas. "Amélioration de la prédictibilité des performances pour les environnements de stockage de données dans les nuages." Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAM016/document.
Today, users of interactive services such as e-commerce and web search have increasingly high expectations on the performance and responsiveness of these services. Indeed, studies have shown that a slow service (even for short periods of time) directly impacts revenue. Enforcing predictable performance has thus been a priority of major service providers in the last decade. But avoiding latency variability in distributed storage systems is challenging, since end-user requests go through hundreds of servers and performance hiccups at any of these servers may inflate the observed latency. Even in well-provisioned systems, factors such as contention on shared resources or unbalanced load between servers affect the latencies of requests and in particular the tail (95th and 99th percentile) of their distribution. The goal of this thesis is to develop mechanisms for reducing latencies and achieving performance predictability in cloud data stores. One effective countermeasure for reducing tail latency in cloud data stores is to provide efficient replica selection algorithms. In replica selection, a request attempting to access a given piece of data (also called a value) identified by a unique key is directed to the presumably best replica. However, under heterogeneous workloads, these algorithms lead to increased latencies for requests with a short execution time that get scheduled behind requests with large execution times. We propose Héron, a replica selection algorithm that supports workloads with heterogeneous request execution times. We evaluate Héron in a cluster of machines using a synthetic dataset inspired by the Facebook dataset as well as two real datasets from Flickr and WikiMedia. Our results show that Héron outperforms state-of-the-art algorithms by reducing both median and tail latency by up to 41%. In the second contribution of the thesis, we focus on multiget workloads to reduce latency in cloud data stores. The challenge is to estimate the bottleneck operations and schedule them on uncoordinated backend servers with minimal overhead. To reach this objective, we present TailX, a task-aware multiget scheduling algorithm that reduces tail latencies under heterogeneous workloads. We implement TailX in Cassandra, a widely used key-value store. The result is an improved overall performance of the cloud data store for a wide variety of heterogeneous workloads.
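To illustrate the general idea of latency-aware replica selection discussed in this abstract (this is not the actual Héron algorithm), the sketch below sends a request to the replica with the lowest estimated completion time, so that a short request does not queue behind a long-running one. Replica names, pending-work figures and slowdown factors are invented.

```python
def select_replica(replicas, request_cost_ms):
    """Pick the replica minimising (queued work + this request's cost on that replica)."""
    def estimated_completion(name):
        info = replicas[name]
        return info["pending_ms"] + request_cost_ms * info["slowdown"]
    return min(replicas, key=estimated_completion)

if __name__ == "__main__":
    replicas = {
        "replica-1": {"pending_ms": 120.0, "slowdown": 1.0},  # busy with a large request
        "replica-2": {"pending_ms": 15.0,  "slowdown": 1.2},
        "replica-3": {"pending_ms": 40.0,  "slowdown": 1.0},
    }
    target = select_replica(replicas, request_cost_ms=2.0)
    print(target)                           # replica-2: the short request avoids the long queue
    replicas[target]["pending_ms"] += 2.0   # book the work on the chosen replica
```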
Schaaf, Thomas. "Couplage inversion et changement d'échelle pour l'intégration des données dynamiques dans les modèles de réservoirs pétroliers." Paris 9, 2003. https://portail.bu.dauphine.fr/fileviewer/index.php?doc=2003PA090046.
Moreira, José. "Un modèle d'approximation pour la représentation du mouvement dans les bases de données spatiales." Paris, ENST, 2001. http://www.theses.fr/2001ENST0016.
Cutillo, Leucio Antonio. "Protection des données privées dans les réseaux sociaux." Electronic Thesis or Diss., Paris, ENST, 2012. http://www.theses.fr/2012ENST0020.
Online Social Network (OSN) applications allow users of all ages and educational backgrounds to easily share a wide range of personal information with a theoretically unlimited number of partners. This advantage comes at the cost of increased security and privacy exposure for users, since in all existing OSN applications, to underpin a promising business model, users' data is collected and stored permanently in the databases of the service provider, which potentially becomes a "Big Brother" capable of exploiting this data in many ways that can violate the privacy of individual users or user groups. This thesis suggests and validates a new approach to tackle these security and privacy problems. In order to ensure users' privacy in the face of potential privacy violations by the provider, the suggested approach adopts a distributed architecture relying on cooperation among a number of independent parties that are also the users of the online social network application. The second strong point of the suggested approach is to capitalize on the trust relationships that are part of social networks in real life in order to build trusted and privacy-preserving mechanisms as part of the online application. Based on these main design principles, a new distributed Online Social Network, Safebook, is proposed: Safebook leverages real-life trust and allows users to keep control over the access to and usage of their own data. The prototype of Safebook is available at www.safebook.eu.
Kerhervé, Brigitte. "Vues relationnelles : implantation dans les systèmes de gestion de bases de données centralisés et répartis." Paris 6, 1986. http://www.theses.fr/1986PA066090.
Cutillo, Leucio Antonio. "Protection des données privées dans les réseaux sociaux." Phd thesis, Télécom ParisTech, 2012. http://pastel.archives-ouvertes.fr/pastel-00932360.
Chikhaoui, Amina. "Vers une approche intelligente de placement de données dans un cloud distribué basé sur un système de stockage hybride." Electronic Thesis or Diss., Brest, 2022. http://www.theses.fr/2022BRES0024.
Cloud federation makes it possible to seamlessly extend the resources of Cloud Service Providers (CSPs) in order to provide a better Quality of Service (QoS) to customers without additional deployment costs. Storage as a Service (StaaS) is one of the main Cloud services offered to customers. For such a service, storage Input/Output (I/O) performance and network latency are among the most important metrics considered by customers. Indeed, transactions for some database queries spend 90% of their execution time in I/O operations. In order to satisfy customers, some Cloud companies already include latency guarantees in their Service Level Agreements (SLAs) and customers can pay additional fees to further reduce latency. This thesis addresses the data placement problem for a CSP that is part of a federation. Indeed, offering attractive and inexpensive services is a big challenge for CSPs. Our goal is to provide intelligent approaches for better data placement that minimize the placement cost for the provider while satisfying the customers' QoS requirements. This approach must take into account the heterogeneity of internal and external storage resources in terms of several parameters (such as capacity, performance and pricing) as well as customer characteristics and requirements. Although many data placement strategies have been proposed for hybrid storage systems, they are not generalizable to every architecture. Indeed, a placement strategy must be designed according to the system architecture for which it is proposed and the target objectives.
Duminuco, Alessandro. "Redondance et maintenance des données dans les systèmes de sauvegarde de fichiers pair-à-pair." Phd thesis, Paris, Télécom ParisTech, 2009. https://pastel.hal.science/pastel-00005541.
The amount of digital data produced by users, such as photos, videos, and digital documents, has grown tremendously over the last decade. These data are very valuable and need to be backed up safely. The research community has shown an increasing interest in the use of peer-to-peer systems for file backup. The key property that makes peer-to-peer systems appealing is self-scaling, i.e. as more peers join the system, the service capacity increases along with the service demand. The design of a peer-to-peer file backup system is a complex task and presents a considerable number of challenges. Peers can be intermittently connected or can fail at a rate that is considerably higher than in centralized storage systems. Our interest focused particularly on how to provide reliable storage of data efficiently, by applying appropriate redundancy schemes and adopting the right mechanisms to maintain this redundancy. This task is not trivial, since data maintenance in such systems may require significant resources in terms of storage space and communication bandwidth. Our contribution is twofold. First, we study erasure coding redundancy schemes able to combine the bandwidth efficiency of replication with the storage efficiency of classical erasure codes. In particular, we introduce and analyze two new classes of codes, namely Regenerating Codes and Hierarchical Codes. Second, we propose a proactive adaptive repair scheme, which combines the adaptiveness of reactive systems with the smooth bandwidth usage of proactive systems, generalizing the two existing approaches.
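The proactive-adaptive idea can be sketched as follows (an illustration only, not the scheme evaluated in the thesis): repairs are issued at a smooth, steady pace as in a proactive system, but that pace is periodically re-estimated from the failure rate observed over a sliding window, which is where the reactive adaptiveness comes in. The class name, window size and rates are assumptions.

```python
from collections import deque

class AdaptiveRepairer:
    def __init__(self, window=10, initial_rate=1.0):
        self.recent_failures = deque(maxlen=window)   # failures observed per epoch
        self.repair_rate = initial_rate               # fragments repaired per epoch

    def observe_epoch(self, failures):
        self.recent_failures.append(failures)
        # smooth target: repair as fast as fragments have been failing, on average
        self.repair_rate = sum(self.recent_failures) / len(self.recent_failures)

    def repairs_this_epoch(self):
        return round(self.repair_rate)

if __name__ == "__main__":
    repairer = AdaptiveRepairer()
    for failures in [0, 1, 0, 4, 5, 4, 1, 0]:   # a burst of churn in the middle
        repairer.observe_epoch(failures)
        print(failures, "->", repairer.repairs_this_epoch())
```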
Duminuco, Alessandro. "Redondance et maintenance des données dans les systèmes de sauvegarde de fichiers pair-à-pair." Phd thesis, Télécom ParisTech, 2009. http://pastel.archives-ouvertes.fr/pastel-00005541.
Relaza, Théodore Jean Richard. "Sécurité et disponibilité des données stockées dans les nuages." Thesis, Toulouse 3, 2016. http://www.theses.fr/2016TOU30009/document.
With the development of the Internet, information technology has essentially been based on communications between servers, user workstations, networks and data centers. Two trends, "making applications available" and "infrastructure virtualization", emerged in the early 2000s, and their convergence resulted in a unifying concept: Cloud Computing. Data storage is a central component of the issues associated with moving processes and resources into the cloud. Whether it is simple storage outsourcing for backup purposes, the use of hosted software services, or the virtualization of the company's computing infrastructure at a third-party provider, data security is crucial. This security breaks down into three axes: data availability, integrity and confidentiality. The context of our work is storage virtualization dedicated to Cloud Computing. This work was carried out under the aegis of the SVC (Secured Virtual Cloud) project, financed by the National Fund for the Digital Society "Investment for the Future" programme. It led to the development of a storage virtualization middleware, named CloViS (Cloud Virtualized Storage), which is entering a valorization phase driven by SATT Toulouse-Tech-Transfer. CloViS is a data management middleware developed within the IRIT laboratory. It allows the virtualization of distributed and heterogeneous storage resources, with uniform and seamless access. CloViS matches user needs with system availabilities through qualities of service defined on virtual volumes. Our contribution in this field concerns data distribution techniques to improve data availability and the reliability of I/O operations in CloViS. Indeed, faced with the explosion in the amount of data, the use of replication cannot be a permanent solution. The use of "Erasure Resilient Codes" or "Threshold Schemes" appears to be a valid alternative for controlling storage volumes. However, no data consistency protocol is, to date, adapted to these new data distribution methods. For this reason, we propose to adapt these different data distribution techniques. We then analyse the resulting protocols, highlighting their respective advantages and disadvantages. Indeed, the choice of a data distribution technique and the associated data consistency protocol is based on performance criteria, especially availability, the number of messages exchanged during read and write operations, and the use of system resources (such as the storage space used).
Monteiro, Julian. "Modélisation et analyse des systèmes de stockage fiable de données dans des réseaux pair-à-pair." Phd thesis, Université de Nice Sophia-Antipolis, 2010. http://tel.archives-ouvertes.fr/tel-00545724.
Goëta, Samuel. "Instaurer des données, instaurer des publics : une enquête sociologique dans les coulisses de l'open data." Electronic Thesis or Diss., Paris, ENST, 2016. http://www.theses.fr/2016ENST0045.
As more than fifty countries have launched an open data policy, this doctoral dissertation investigates the emergence and implementation of such policies. It is based on the analysis of public sources and an ethnographic inquiry conducted in seven French local authorities and institutions. By retracing six moments of definition of the "open data principles" and their implementation by a French institution, Etalab, this work shows how open data has brought attention to data, particularly in their raw form, considered as an untapped resource, the "new oil" lying beneath organisations. The inquiry shows that the process of opening generally begins with a phase of identification marked by progressive and uncertain explorations. It allows us to understand that data are progressively instantiated from management files into data. Their circulation provokes frictions: to leave the sociotechnical network of organisations, data generally go through validation circuits and chains of treatment. Besides, data must often undergo significant processing before their opening in order to become intelligible to machines as well as humans. This thesis eventually shows that data publics are also instantiated, as they are expected to visualize, inspect and process the data. Data publics are instantiated through various tools, which constitute another area of the invisible work of open data projects. Finally, it appears from this work that the possible legal requirement to open data raises a fundamental question: "what is data?" Instead of reducing data to a relational category that would apply to any informational material, the studied cases show that the term is generally applied when data are the starting point of sociotechnical networks dedicated to their circulation, exploitation and visibility.
Chihoub, Houssem Eddine. "Managing consistency for big data applications : tradeoffs and self-adaptiveness." Thesis, Cachan, Ecole normale supérieure, 2013. http://www.theses.fr/2013DENS0059/document.
In the era of Big Data, data-intensive applications handle extremely large volumes of data while requiring fast processing times. A large number of such applications run in the cloud in order to benefit from cloud elasticity, easy on-demand deployments, and cost-efficient Pay-As-You-Go usage. In this context, replication is an essential feature in the cloud for dealing with Big Data challenges: it enables high availability through multiple replicas, fast data access to local replicas, fault tolerance, and disaster recovery. However, replication introduces the major issue of data consistency across different copies. Consistency management is critical for Big Data systems. Strong consistency models introduce serious limitations to system scalability and performance due to the required synchronization efforts. In contrast, weak and eventual consistency models reduce the performance overhead and enable high levels of availability; however, these models may tolerate, under certain scenarios, too much temporal inconsistency. In this Ph.D. thesis, we address this issue of consistency trade-offs in large-scale Big Data systems and applications. We first focus on consistency management at the storage-system level. Accordingly, we propose an automated self-adaptive model (named Harmony) that scales the consistency level up or down at runtime when needed, in order to provide performance as high as possible while preserving the application's consistency requirements. In addition, we present a thorough study of the impact of consistency management on the monetary cost of running in the cloud, and we leverage this study to propose a cost-efficient consistency tuning approach (named Bismar) in the cloud. In a third direction, we study the impact of consistency management on energy consumption within the data center; following our findings, we investigate adaptive configurations of the storage-system cluster that target energy saving. In order to complete our system-side study, we focus on the application level. Applications are different and so are their consistency requirements, and understanding such requirements at the storage-system level is not possible. Therefore, we propose an application behavior model that captures the consistency requirements of an application. Based on this model, we propose an online prediction approach, named Chameleon, that adapts to application-specific needs and provides customized consistency.
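The runtime tuning idea can be illustrated with a deliberately naive sketch (it is not the Harmony estimator itself): the read quorum is raised when the estimated probability of a stale read exceeds what the application tolerates, and lowered again when the workload is quiet. The estimator, the halving rule and all numbers are assumptions made for the example.

```python
def estimate_stale_read_probability(writes_per_s, replication_lag_s):
    """Crude estimate: chance that a read lands inside the propagation window of a write."""
    return min(1.0, writes_per_s * replication_lag_s)

def choose_read_quorum(stale_probability, tolerance, min_quorum=1, max_quorum=3):
    """Scale the number of replicas consulted per read up or down at runtime."""
    quorum = min_quorum
    while stale_probability > tolerance and quorum < max_quorum:
        quorum += 1
        stale_probability /= 2     # assume each extra replica roughly halves the risk
    return quorum

if __name__ == "__main__":
    p = estimate_stale_read_probability(writes_per_s=50, replication_lag_s=0.01)
    print(choose_read_quorum(p, tolerance=0.05))   # 3: a write-heavy phase needs stronger reads
    p = estimate_stale_read_probability(writes_per_s=2, replication_lag_s=0.01)
    print(choose_read_quorum(p, tolerance=0.05))   # 1: eventual consistency is enough here
```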
Bahloul, Khaled. "Optimisation combinée des coûts de transport et de stockage dans un réseau logistique dyadique, multi-produits avec demande probabiliste." Phd thesis, INSA de Lyon, 2011. http://tel.archives-ouvertes.fr/tel-00695275.
Lima, Jose Valdeni de. "Gestion d'objects composes dans un SGBD : cas particulier des documents structures." Biblioteca Digital de Teses e Dissertações da UFRGS, 1990. http://hdl.handle.net/10183/18391.
Goeta, Samuel. "Instaurer des données, instaurer des publics : une enquête sociologique dans les coulisses de l'open data." Thesis, Paris, ENST, 2016. http://www.theses.fr/2016ENST0045/document.
Vasilopoulos, Dimitrios. "Reconciling cloud storage functionalities with security : proofs of storage with data reliability and secure deduplication." Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS399.
In this thesis we study in depth the problem of verifiability in cloud storage systems. We study Proofs of Storage, a family of cryptographic protocols that enable a cloud storage provider to prove to a user that the integrity of her data has not been compromised, and we identify their limitations with respect to two key characteristics of cloud storage systems, namely reliable data storage with automatic maintenance and data deduplication. To cope with the first characteristic, we introduce the notion of Proofs of Data Reliability, a comprehensive verification scheme that aims to resolve the conflict between reliable data storage verification and automatic maintenance. We further propose two Proofs of Data Reliability schemes, namely POROS and PORTOS, that succeed in verifying reliable data storage and, at the same time, enable the cloud storage provider to autonomously perform automatic maintenance operations. Regarding the second characteristic, we address the conflict between Proofs of Storage and deduplication. More precisely, inspired by previous attempts at solving the problem of deduplicating encrypted data, we propose message-locked PoR, a solution that combines Proofs of Storage with deduplication. In addition, we propose a novel message-locked key generation protocol which is more resilient against off-line dictionary attacks than existing solutions.
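For intuition only, the sketch below shows the simplest form of a storage-verification exchange; it is not one of the schemes proposed in the thesis (POROS, PORTOS or message-locked PoR), and it has exactly the limitations, single-use challenges and whole-file reads, that real PDP/PoR protocols are designed to remove.

```python
import hashlib, os

def precompute_challenges(data, count=3):
    """Before outsourcing: the user precomputes responses to a few random challenges."""
    challenges = []
    for _ in range(count):
        nonce = os.urandom(16)
        expected = hashlib.sha256(nonce + data).hexdigest()
        challenges.append((nonce, expected))
    return challenges        # kept by the user; the local copy of the file can now be deleted

def provider_response(stored_data, nonce):
    """The provider can only answer correctly if it still holds the intact file."""
    return hashlib.sha256(nonce + stored_data).hexdigest()

if __name__ == "__main__":
    original = b"outsourced archive content"
    challenges = precompute_challenges(original)

    nonce, expected = challenges.pop()
    assert provider_response(original, nonce) == expected          # honest provider passes
    assert provider_response(b"corrupted!", nonce) != expected     # corruption is detected
```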
Sun, Yan. "Simulation du cycle biogéochimique du phosphore dans le modèle de surface terrestre ORCHIDEE : évaluation par rapport à des données d'observation locales et mondiales." Electronic Thesis or Diss., université Paris-Saclay, 2021. http://www.theses.fr/2021UPASJ001.
Phosphorus (P) plays a critical role in controlling metabolic processes, soil organic matter dynamics, plant growth and ecosystem productivity, thereby affecting the greenhouse gas (GHG) balance of land ecosystems. A small number of land surface models have incorporated P cycles, but their predictions of GHG balances remain highly uncertain. The reasons are: (1) scarce benchmarking data for key P-related processes (e.g. continental- to global-scale gridded datasets), (2) the lack of a comprehensive global evaluation strategy tailored to P processes and their interlinkages with the carbon and nitrogen (N) cycles, and (3) insufficient model calibration, limited by the high computational cost of simulating coupled CNP cycles, which operate on timescales of minutes to millennia. Addressing those research gaps, I apply a combination of statistical methods (machine learning), LSMs and observational data across various scales. Firstly (Chapter 2), to address the lack of benchmarking data, I applied two machine-learning methods with the aim of producing spatially gridded maps of acid phosphatase (AP) activity at continental scale by scaling up scattered site observations of potential AP activity. AP secreted by fungi, bacteria and plant roots plays an important role in recycling soil P by transforming unavailable organic P into assimilable phosphate. The back-propagation artificial neural network (BPN) method that was chosen explained 58% of the AP variability and was able to identify the gradients in AP along three transects in Europe. Soil nutrients (total nitrogen, total P and labile organic P) and climatic controls (annual precipitation, mean annual temperature and temperature amplitude) were found to be the dominant factors influencing AP variations in space. Secondly (Chapter 3), I evaluated the performance of the global version of the land surface model ORCHIDEE-CNP (v1.2) using the data from Chapter 2 as well as additional data from remote sensing, ground-based measurement networks and ecological databases. Simulated components of the N and P cycles at different levels of aggregation (from local to global) are in good agreement with data-driven estimates. We identified model biases in the simulated large-scale patterns of leaf and soil stoichiometry and plant P use efficiency, which point towards an underestimation of P availability towards the poles. Based on our analysis, we propose ways to address the model biases by giving priority to better representing the processes of soil organic P mineralization and soil inorganic P transformation. Lastly (Chapter 4), I designed and tested a machine-learning (ML)-based procedure for accelerating the equilibration of biogeochemical cycles to boundary conditions (spin-up), which is the cause of the low computational efficiency of current P-enabled LSMs. This ML-based acceleration approach (MLA) requires spinning up only a small subset of model pixels (14.1%), from which the equilibrium state of the remaining pixels is estimated by ML. MLA predicts the equilibrium state of soil, biomass and litter C, N and P at both PFT and global scales sufficiently well, as indicated by the minor error introduced in the simulated current land carbon balance. The computational consumption of MLA is about one order of magnitude less than that of the currently used approach, which opens the opportunity for data assimilation using the ever-growing observation datasets. In the outlook, specific applications of the MLA approach and future research priorities are discussed to further improve the reliability and robustness of phosphorus-enabled land surface models.
Ikken, Sonia. "Efficient placement design and storage cost saving for big data workflow in cloud datacenters." Electronic Thesis or Diss., Evry, Institut national des télécommunications, 2017. http://www.theses.fr/2017TELE0020.
Typical cloud big data systems are workflow-based, including MapReduce, which has emerged as the paradigm of choice for developing large-scale data-intensive applications. Data generated by such systems are huge, valuable, and stored at multiple geographical locations for reuse. Indeed, workflow systems, composed of jobs using collaborative task-based models, present new dependency and intermediate data exchange needs. This gives rise to new issues when selecting distributed data and storage resources, so that the execution of tasks or jobs is on time and resource usage is cost-efficient. Furthermore, the performance of task processing is governed by the efficiency of intermediate data management. In this thesis we tackle the problem of intermediate data management in cloud multi-datacenters by considering the requirements of the workflow applications generating them. To this end, we design and develop models and algorithms for the big data placement problem in the underlying geo-distributed cloud infrastructure, so that the data management cost of these applications is minimized. The first problem addressed is the study of the intermediate data access behavior of tasks running in a MapReduce-Hadoop cluster. Our approach develops and explores a Markov model that uses the spatial locality of intermediate data blocks and analyzes spill file sequentiality through a prediction algorithm. Secondly, this thesis deals with the storage cost minimization of intermediate data placement in federated cloud storage. Through a federation mechanism, we propose an exact ILP algorithm to assist multiple cloud datacenters hosting the generated intermediate data dependencies of pairs of files. The proposed algorithm takes into account scientific user requirements, data dependency and data size. Finally, a more generic problem is addressed in this thesis that involves two variants of the placement problem: splittable and unsplittable intermediate data dependencies. The main goal is to minimize the operational data cost according to inter- and intra-job dependencies.
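As a toy illustration of the cost-driven placement question raised here (the thesis solves it exactly with an ILP; this is only a greedy heuristic with invented prices), each intermediate dataset is placed on the federated datacenter that minimises storage cost plus the transfer cost from the site that produced it.

```python
DATACENTERS = {
    #            $/GB-month storage   $/GB ingress transfer
    "dc-east":  {"storage": 0.020,    "transfer": 0.01},
    "dc-west":  {"storage": 0.023,    "transfer": 0.00},
    "dc-eu":    {"storage": 0.025,    "transfer": 0.02},
}

def placement_cost(dc, size_gb, produced_at):
    """Monthly storage cost plus a one-off ingress cost if the data must move."""
    prices = DATACENTERS[dc]
    transfer = 0.0 if dc == produced_at else prices["transfer"] * size_gb
    return prices["storage"] * size_gb + transfer

def place(datasets):
    """datasets: list of (name, size_gb, produced_at); returns name -> chosen datacenter."""
    plan = {}
    for name, size_gb, produced_at in datasets:
        plan[name] = min(DATACENTERS, key=lambda dc: placement_cost(dc, size_gb, produced_at))
    return plan

if __name__ == "__main__":
    intermediate = [("spill-001", 120.0, "dc-west"), ("spill-002", 40.0, "dc-eu")]
    print(place(intermediate))
```

Unlike the exact formulation, such a heuristic ignores inter-job dependencies, which is precisely what makes the splittable and unsplittable variants studied in the thesis harder.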
Bellec, Matthieu. "Nanostructuration par laser femtoseconde dans un verre photo-luminescent." Phd thesis, Université Sciences et Technologies - Bordeaux I, 2009. http://tel.archives-ouvertes.fr/tel-00459311.
Kaaniche, Nesrine. "Cloud data storage security based on cryptographic mechanisms." Thesis, Evry, Institut national des télécommunications, 2014. http://www.theses.fr/2014TELE0033/document.
Recent technological advances have given rise to the popularity and success of the cloud. This new paradigm is gaining an expanding interest, since it provides cost-efficient architectures that support the transmission, storage, and intensive computing of data. However, these promising storage services bring many challenging design issues, considerably due to the loss of data control. These challenges, namely data confidentiality and data integrity, have a significant influence on the security and performance of the cloud system. This thesis aims at overcoming this trade-off, while considering two data security concerns. On the one hand, we focus on data confidentiality preservation, which becomes more complex with flexible data sharing among a dynamic group of users. It requires the secrecy of outsourced data and an efficient sharing of decryption keys between different authorized users. For this purpose, we first proposed a new method relying on the use of ID-Based Cryptography (IBC), where each client acts as a Private Key Generator (PKG): he generates his own public elements and derives his corresponding private key using a secret. Thanks to IBC properties, this contribution is shown to support data privacy and confidentiality, and to be resistant to unauthorized access to data during the sharing process, while considering two realistic threat models, namely an honest-but-curious server and a malicious user adversary. Second, we define CloudaSec, a public-key-based solution, which proposes the separation of subscription-based key management and confidentiality-oriented asymmetric encryption policies. That is, CloudaSec enables flexible and scalable deployment of the solution as well as strong security guarantees for outsourced data in cloud servers. Experimental results, under OpenStack Swift, have proven the efficiency of CloudaSec in scalable data sharing, while considering the impact of the cryptographic operations at the client side. On the other hand, we address the Proof of Data Possession (PDP) concern. In fact, the cloud customer should have an efficient way to perform periodical remote integrity verifications, without keeping the data locally, along three substantial aspects: security level, public verifiability, and performance. This concern is magnified by the client's constrained storage and computation capabilities and the large size of outsourced data. In order to fulfill this security requirement, we first define a new zero-knowledge PDP protocol that provides deterministic integrity verification guarantees, relying on the uniqueness of the Euclidean division. These guarantees are considered interesting compared to several proposed schemes that present probabilistic approaches. Then, we propose SHoPS, a Set-Homomorphic Proof of Data Possession scheme, supporting the three levels of data verification. SHoPS enables the cloud client not only to obtain a proof of possession from the remote server, but also to verify that a given data file is distributed across multiple storage devices to achieve a certain desired level of fault tolerance. Indeed, we present the set homomorphism property, which extends malleability to set operations such as union, intersection and inclusion. SHoPS presents a high security level and low processing complexity. For instance, SHoPS saves energy within the cloud provider by distributing the computation over multiple nodes, each node providing proofs of local data block sets. This makes a resulting proof over sets of data blocks applicable, satisfying several needs, such as proof aggregation.
Kaaniche, Nesrine. "Cloud data storage security based on cryptographic mechanisms." Electronic Thesis or Diss., Evry, Institut national des télécommunications, 2014. http://www.theses.fr/2014TELE0033.
Ikken, Sonia. "Efficient placement design and storage cost saving for big data workflow in cloud datacenters." Thesis, Evry, Institut national des télécommunications, 2017. http://www.theses.fr/2017TELE0020/document.
Carpen-Amarie, Alexandra. "BlobSeer as a data-storage facility for clouds : self-Adaptation, integration, evaluation." Thesis, Cachan, Ecole normale supérieure, 2011. http://www.theses.fr/2011DENS0066/document.
The emergence of Cloud computing brings forward many challenges that may limit the adoption rate of the Cloud paradigm. As data volumes processed by Cloud applications increase exponentially, designing efficient and secure solutions for data management emerges as a crucial requirement. The goal of this thesis is to enhance a distributed data-management system with self-management capabilities, so that it can meet the requirements of the Cloud storage services in terms of scalability, data availability, reliability and security. Furthermore, we aim at building a Cloud data service both compatible with state-of-the-art Cloud interfaces and able to deliver high-throughput data storage. To meet these goals, we proposed generic self-awareness, self-protection and self-configuration components targeted at distributed data-management systems. We validated them on top of BlobSeer, a large-scale data-management system designed to optimize highly-concurrent data accesses. Next, we devised and implemented a BlobSeer-based file system optimized to efficiently serve as a storage backend for Cloud services. We then integrated it within a real-world Cloud environment, the Nimbus platform. The benefits and drawbacks of using Cloud storage for real-life applications have been emphasized in evaluations that involved data-intensive MapReduce applications and tightly-coupled, high-performance computing applications.
Božić, Nikola. "Blockchain technologies and their application to secure virtualized infrastructure control." Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS596.
Blockchain is a technology that makes the shared registry concept from distributed systems a reality for a number of application domains, from cryptocurrencies to potentially any industrial system requiring decentralized, robust, trusted and automated decision making in a multi-stakeholder situation. Nevertheless, the actual advantages of using blockchain instead of traditional solutions (such as centralized databases) are not completely understood to date, or at least there is a strong need for a vademecum guiding designers toward the right decision about when to adopt blockchain or not, which kind of blockchain better meets use-case requirements, and how to use it. First, we aim at providing the community with such a vademecum, while giving a general presentation of blockchain that goes beyond its usage in Bitcoin and surveying a selection of the vast literature that emerged in the last few years. We draw the key requirements and their evolution when passing from permissionless to permissioned blockchains, presenting the differences between proposed and experimented consensus mechanisms, and describing existing blockchain platforms. Furthermore, we present the B-VMOA blockchain, which secures virtual machine orchestration operations for cloud computing and network functions virtualization systems, applying the proposed vademecum logic. Using tutorial examples, we describe our design choices and draw implementation plans. We further develop the vademecum logic applied to cloud orchestration and show how it can lead to precise platform specifications. We capture the key system operations and the complex interactions between them. We focus on the latest release of the Hyperledger Fabric platform as a way to develop the B-VMOA system. Besides, Hyperledger Fabric optimizes the conceived B-VMOA network's performance, security, and scalability by separating the workload across (i) transaction execution and validation peers, and (ii) transaction ordering nodes. We study and use a distributed execute-order-validate architecture, which differentiates our conceived B-VMOA system from legacy distributed systems that follow a traditional state-machine replication architecture. We parameterize and validate our model with data collected from a realistic testbed, presenting an empirical study to characterize system performance and identify potential performance bottlenecks. Furthermore, we present the tools we used, the network setup and a discussion on empirical observations from the data collection. We examine the impact of various configurable parameters to conduct an in-depth study of core components and benchmark performance for common usage patterns. Namely, B-VMOA is meant to be run within a data center. Different data center interconnection topologies scale differently due to communication protocols. Efficiently designing the network interconnections so that the deployment and maintenance of the infrastructure remain cost-effective is enormously challenging. We analyze the structural properties of several DCN topologies and also present a comparison among these network architectures with the aim of reducing B-VMOA overhead costs. From our analysis, we recommend the hypercube topology as a solution to address the performance bottleneck in the B-VMOA control plane caused by the gossip dissemination protocol, along with an estimate of the performance improvement.
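A small sketch helps explain why a hypercube interconnect suits gossip dissemination, as argued above (the numbers are illustrative and unrelated to the actual B-VMOA deployment): in a d-dimensional hypercube each of the 2**d nodes has exactly d neighbours, obtained by flipping one bit of its identifier, so a rumour started anywhere can reach every node in at most d hops.

```python
def hypercube_neighbors(node, dimension):
    """Neighbours of a node in a d-dimensional hypercube: flip each of the d identifier bits."""
    return [node ^ (1 << bit) for bit in range(dimension)]

def gossip_rounds_to_cover(dimension):
    """Flood along one new dimension per round: full coverage after `dimension` rounds."""
    informed = {0}
    rounds = 0
    while len(informed) < 2 ** dimension:
        informed |= {n ^ (1 << rounds) for n in informed}
        rounds += 1
    return rounds

if __name__ == "__main__":
    print(hypercube_neighbors(0b0101, dimension=4))   # [4, 7, 1, 13]: the four one-bit flips
    print(gossip_rounds_to_cover(4))                  # 4 rounds to reach all 16 nodes
```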
Moise, Diana Maria. "Optimizing data management for MapReduce applications on large-scale distributed infrastructures." Thesis, Cachan, Ecole normale supérieure, 2011. http://www.theses.fr/2011DENS0067/document.
Data-intensive applications are nowadays widely used in various domains to extract and process information, to design complex systems, to perform simulations of real models, etc. These applications exhibit challenging requirements in terms of both storage and computation. Specialized abstractions like Google's MapReduce were developed to efficiently manage the workloads of data-intensive applications. The MapReduce abstraction has revolutionized the data-intensive community and has rapidly spread to various research and production areas. An open-source implementation of Google's abstraction was provided by Yahoo! through the Hadoop project. This framework is considered the reference MapReduce implementation and is currently heavily used for various purposes and on several infrastructures. To achieve high-performance MapReduce processing, we propose a concurrency-optimized file system for MapReduce frameworks. As a starting point, we rely on BlobSeer, a framework that was designed as a solution to the challenge of efficiently storing data generated by data-intensive applications running at large scale. We have built the BlobSeer File System (BSFS), with the goal of providing high throughput under heavy concurrency to MapReduce applications. We also study several aspects related to intermediate data management in MapReduce frameworks. We investigate the requirements of MapReduce intermediate data at two levels: inside the same job, and during the execution of pipeline applications. Finally, we show how BSFS can enable extensions to the de facto MapReduce implementation, Hadoop, such as support for the append operation. This work also comprises the evaluation and the obtained results in the context of grid and cloud environments.