Dissertations / Theses on the topic 'Data privacy'

Listed below are top dissertations and theses for research on the topic 'Data privacy'. The full text of each publication can be downloaded as a PDF, and the abstract is shown whenever available in the metadata.

1

Zhang, Nan. "Privacy-preserving data mining." College Station, Tex.: Texas A&M University, 2006. http://hdl.handle.net/1969.1/ETD-TAMU-1080.

2

Nguyen, Benjamin. "Privacy-Centric Data Management." Habilitation à diriger des recherches, Université de Versailles-Saint Quentin en Yvelines, 2013. http://tel.archives-ouvertes.fr/tel-00936130.

Abstract:
This document covers my core computer science research since 2010 on the topic of data management and privacy. More specifically, I present the following topics: a new paradigm, called Trusted Cells, for privacy-centric personal data management, based on an asymmetric architecture composed of trusted or open (low-power) distributed hardware devices acting as personal data servers and a highly powerful, highly available supporting server, such as a cloud (Chapter 2); adapting aggregate data computation techniques to the Trusted Cells environment, illustrated by Privacy-Preserving Data Publishing (Chapter 3); and minimizing the data that leaves a Trusted Cell, i.e. enforcing the general privacy principle of Limited Data Collection (Chapter 4). This document contains only results that have already been published. As such, rather than focus on the details and technicalities of each result, I have tried to provide a global understanding of the context behind the work, explain the problem it addresses, and summarize the main scientific results and their impact.
3

Lin, Zhenmin. "Privacy Preserving Distributed Data Mining." UKnowledge, 2012. http://uknowledge.uky.edu/cs_etds/9.

Abstract:
Privacy-preserving distributed data mining aims to design secure protocols which allow multiple parties to conduct collaborative data mining while protecting data privacy. My research focuses on the design and implementation of privacy-preserving two-party protocols based on homomorphic encryption. I present new results in this area, including new secure protocols for basic operations and two fundamental privacy-preserving data mining protocols. I propose a number of secure protocols for basic operations in the additive secret-sharing scheme based on homomorphic encryption. I derive a basic relationship between a secret number and its shares, with which I develop efficient secure comparison and secure division with public divisor protocols. I also design a secure inverse square root protocol based on Newton's iterative method and hence propose a solution to the secure square root problem. In addition, I propose a secure exponential protocol based on Taylor series expansions. All these protocols are implemented using secure multiplication and can be used to develop privacy-preserving distributed data mining protocols. In particular, I develop efficient privacy-preserving protocols for two fundamental data mining tasks: multiple linear regression and EM clustering. Both protocols work for arbitrarily partitioned datasets. The two-party privacy-preserving linear regression protocol is provably secure in the semi-honest model, and the EM clustering protocol discloses only the number of iterations. I provide a proof-of-concept implementation of these protocols in C++, based on the Paillier cryptosystem.
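The additive secret-sharing idea underlying such protocols can be illustrated in a few lines. This is a generic sketch of n-party additive sharing over a public modulus, not the thesis's actual protocols; the modulus choice is an assumption made for the example.

```python
import secrets

M = 2**61 - 1  # public modulus; an illustrative choice, not from the thesis

def share(x, n=2):
    """Split secret x into n additive shares with x = sum(shares) mod M."""
    shares = [secrets.randbelow(M) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % M)
    return shares

def reconstruct(shares):
    return sum(shares) % M

# Addition is "free": each party adds its own shares locally, and the
# resulting shares reconstruct to x + y without revealing x or y.
xs, ys = share(12), share(30)
zs = [(a + b) % M for a, b in zip(xs, ys)]
print(reconstruct(zs))  # 42
```

Non-linear operations (comparison, division, square root) are the hard part, which is where the protocols built on homomorphic encryption in the thesis come in.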
4

Aron, Yotam. "Information privacy for linked data." Thesis, Massachusetts Institute of Technology, 2013. http://hdl.handle.net/1721.1/85215.

Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 77-79).
As data mining over massive amounts of linked data becomes more and more prevalent in research applications, information privacy becomes a more important issue. This is especially true in the biological and medical fields, where information sensitivity is high. Previous experience has shown that simple anonymization techniques, such as removing an individual's name from a data set, are inadequate to fully protect the data's participants. While strong privacy guarantees have been studied for relational databases, these are virtually non-existent for graph-structured linked data. This line of research is important, however, since the aggregation of data across different web sources may lead to privacy leaks. The ontological structure of linked data especially aids these attacks on privacy. The purpose of this thesis is two-fold. The first is to investigate differential privacy, a strong privacy guarantee, and how to construct differentially-private mechanisms for linked data. The second involves the design and implementation of the SPARQL Privacy Insurance Module (SPIM). Using a combination of well-studied techniques, such as authentication and access control, and the mechanisms developed to maintain differential privacy over linked data, it attempts to limit privacy hazards for SPARQL queries. By using these privacy-preservation techniques, data owners may be more willing to share their data sets with other researchers without the fear that it will be misused. Consequently, we can expect greater sharing of information, which will foster collaboration and improve the types of data that researchers can have access to.
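The differential privacy guarantee the thesis investigates is commonly realized with the Laplace mechanism. Below is a textbook sketch for a count query (sensitivity 1), not the SPIM implementation; the function name is invented for the example.

```python
import random

def dp_count(records, predicate, epsilon):
    """Epsilon-differentially-private count.

    A count query has sensitivity 1, so adding Laplace noise with
    scale 1/epsilon suffices.  Laplace(0, b) noise is sampled here as
    the difference of two exponentials with mean b.
    """
    true_count = sum(1 for r in records if predicate(r))
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

ages = [23, 35, 41, 29, 52, 61, 33]
print(dp_count(ages, lambda a: a >= 40, epsilon=0.5))  # true count 3 plus noise
```

Smaller epsilon means more noise and stronger privacy; applying this to SPARQL results over linked data is harder because graph queries can have much larger sensitivity than a simple count.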
by Yotam Aron.
M. Eng.
5

Jawad, Mohamed. "Data privacy in P2P Systems." Nantes, 2011. http://www.theses.fr/2011NANT2020.

Abstract:
Online peer-to-peer (P2P) communities, such as professional ones (e.g., medical or research communities), are becoming popular due to increasing needs for data sharing. P2P environments offer valuable characteristics (e.g., scalability, availability, dynamicity) but limited guarantees when sharing sensitive data. They can be considered hostile because data can be accessed by everyone (including potentially malicious peers) and used for anything (e.g., for illicit marketing or for activities against the owner's preferences or ethics). This thesis proposes a privacy service that allows sharing sensitive data in P2P systems while protecting its confidentiality. The first contribution is an analysis of existing techniques for data privacy in P2P architectures. The second contribution is a privacy model for P2P systems, named PriMod, which allows data owners to specify their privacy preferences in privacy policies and to attach these policies to their sensitive data. The third contribution is the development of PriServ, a privacy service built on top of DHT-based P2P systems, which implements PriMod to prevent data privacy violations. Among other techniques, PriServ uses trust to predict peer behaviour.
6

Foresti, S. "Preserving privacy in data outsourcing." Doctoral thesis, Università degli Studi di Milano, 2010. http://hdl.handle.net/2434/156360.

Abstract:
Privacy requirements have an increasing impact on the realization of modern applications. Commercial and legal regulations demand that privacy guarantees be provided whenever sensitive information is stored, processed, or communicated to external parties. Current approaches encrypt sensitive data, thus reducing query execution efficiency and preventing selective information release. In this thesis, we present a comprehensive approach for protecting highly sensitive information when it is stored on systems that are not under the data owner's control. Our approach combines access control and encryption, enforcing access control via structured encryption. Our solution, coupled with efficient algorithms for key derivation and distribution, provides efficient and secure authorization management on outsourced data, allowing the data owner to outsource not only the data but the security policy itself. To reduce the amount of data to be encrypted, we also investigate data fragmentation as a way to protect the privacy of data associations, providing fragmentation as a complementary means of protection: associations broken by fragmentation are visible only to users authorized to join fragments (by knowing the proper key). We finally investigate the problem of executing queries over data possibly distributed at different servers, where execution must be controlled to ensure that sensitive information and sensitive associations are visible only to authorized parties.
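The key-derivation idea behind enforcing access control through encryption can be sketched with a simple hash-based hierarchy. This is a generic illustration under invented names; the thesis's actual derivation structures (token-based key graphs) are richer.

```python
import hashlib

def derive_key(parent_key: bytes, label: str) -> bytes:
    """Derive the key of a child class from its parent's key and a
    public label.  Anyone holding the parent key can recompute child
    keys; inverting the hash to recover the parent is infeasible."""
    return hashlib.sha256(parent_key + label.encode()).digest()

root = b"\x00" * 32            # key held by the data owner (illustrative)
k_medical = derive_key(root, "medical")
k_oncology = derive_key(k_medical, "oncology")

# A user granted k_medical can derive k_oncology on their own, so the
# owner distributes one key per authorized subtree rather than one key
# per encrypted fragment.
assert derive_key(k_medical, "oncology") == k_oncology
```

This is what makes it possible to outsource the policy itself: the server stores only ciphertexts and public derivation labels, never the keys.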
7

Livraga, G. "PRESERVING PRIVACY IN DATA RELEASE." Doctoral thesis, Università degli Studi di Milano, 2014. http://hdl.handle.net/2434/233324.

Abstract:
Data sharing and dissemination play a key role in our information society. Not only do they prove to be advantageous to the involved parties, but they can also be fruitful to the society at large (e.g., new treatments for rare diseases can be discovered based on real clinical trials shared by hospitals and pharmaceutical companies). The advancements in Information and Communication Technology (ICT) make the process of releasing a data collection simpler than ever. The availability of novel computing paradigms, such as data outsourcing and cloud computing, makes scalable, reliable and fast infrastructures a dream come true at reasonable costs. As a natural consequence of this scenario, data owners often rely on external storage servers for releasing their data collections, thus delegating the burden of data storage and management to the service provider. Unfortunately, the price to be paid when releasing a collection of data is in terms of unprecedented privacy risks. Data collections often include sensitive information, not intended for disclosure, that should be properly protected. The problem of protecting privacy in data release has been under the attention of the research and development communities for a long time. However, the richness of released data, the large number of available sources, and the emerging outsourcing/cloud scenarios raise novel problems, not addressed by traditional approaches, which need enhanced solutions. In this thesis, we define a comprehensive approach for protecting sensitive information when large collections of data are publicly or selectively released by their owners. In a nutshell, this requires protecting data explicitly included in the release, as well as protecting information not explicitly released but that could be exposed by the release, and ensuring that access to released data be allowed only to authorized parties according to the data owners' policies.
More specifically, these three aspects translate to three requirements, addressed by this thesis, which can be summarized as follows. The first requirement is the protection of data explicitly included in a release. While intuitive, this requirement is complicated by the fact that privacy-enhancing techniques should not prevent recipients from performing legitimate analysis on the released data but, on the contrary, should ensure sufficient visibility over non sensitive information. We therefore propose a solution, based on a novel formulation of the fragmentation approach, that vertically fragments a data collection so as to satisfy requirements for both information protection and visibility, and we complement it with an effective means for enriching the utility of the released data. The second requirement is the protection of data not explicitly included in a release. As a matter of fact, even a collection of non sensitive data might enable recipients to infer (possibly sensitive) information not explicitly disclosed but that somehow depends on the released information (e.g., the release of the treatment a patient is receiving can leak information about her disease). To address this requirement, starting from a real case study, we propose a solution for counteracting the inference of sensitive information that can be drawn by observing peculiar value distributions in the released data collection. The third requirement is access control enforcement. Available solutions fall short for a variety of reasons. Traditional access control mechanisms are based on a reference monitor and do not fit outsourcing/cloud scenarios, since neither the data owner is willing, nor the cloud storage server is trusted, to enforce the access control policy. Recent solutions for access control enforcement in outsourcing scenarios assume outsourced data to be read-only and cannot easily manage (dynamic) write authorizations.
We therefore propose an approach for efficiently supporting grant and revoke of write authorizations, building upon the selective encryption approach, and we also define a subscription-based authorization policy, to fit real-world scenarios where users pay for a service and access the resources made available during their subscriptions. The main contributions of this thesis can therefore be summarized as follows. With respect to the protection of data explicitly included in a release, our original results are: i) a novel modeling of the fragmentation problem; ii) an efficient technique for computing a fragmentation, based on reduced Ordered Binary Decision Diagrams (OBDDs) to formulate the conditions that a fragmentation must satisfy; iii) the computation of a minimal fragmentation not fragmenting data more than necessary, with the definition of both an exact and a heuristic algorithm, the latter providing faster computation while closely approximating the exact solutions; and iv) the definition of loose associations, a sanitized form of the sensitive associations broken by fragmentation that can be safely released, specifically extended to operate on arbitrary fragmentations. With respect to the protection of data not explicitly included in a release, our original results are: i) the definition of a novel and unresolved inference scenario, arising from a real case study where data items are incrementally released upon request; ii) the definition of several metrics to assess the inference exposure due to a data release, based upon the concepts of mutual information, Kullback-Leibler distance between distributions, Pearson's cumulative statistic, and Dixon's coefficient; and iii) the identification of a safe release with respect to the considered inference channel and the definition of the controls to be enforced to guarantee that no sensitive information be leaked when releasing non sensitive data items.
With respect to access control enforcement, our original results are: i) the management of dynamic write authorizations, by defining a solution based on selective encryption for efficiently and effectively supporting grant and revoke of write authorizations; ii) the definition of an effective technique to guarantee data integrity, so to allow the data owner and the users to verify that modifications to a resource have been produced only by authorized users; and iii) the modeling and enforcement of a subscription-based authorization policy, to support scenarios where both the set of users and the set of resources change frequently over time, and users’ authorizations are based on their subscriptions.
8

Loukides, Grigorios. "Data utility and privacy protection in data publishing." Thesis, Cardiff University, 2008. http://orca.cf.ac.uk/54743/.

Abstract:
Data about individuals is being increasingly collected and disseminated for purposes such as business analysis and medical research. This has raised privacy concerns. In response, a number of techniques have been proposed which attempt to transform data prior to its release so that sensitive information about the individuals contained within it is protected. k-Anonymisation is one such technique that has attracted much recent attention from the database research community. k-Anonymisation works by transforming data in such a way that each record is made identical to at least k-1 other records with respect to those attributes that are likely to be used to identify individuals. This helps prevent sensitive information associated with individuals from being disclosed, as each individual is represented by at least k records in the dataset. Ideally, a k-anonymised dataset should maximise both data utility and privacy protection, i.e. it should allow intended data analytic tasks to be carried out without loss of accuracy while preventing sensitive information disclosure; but these two notions are conflicting, and only a trade-off between them can be achieved in practice. Existing works, however, focus on how either the utility or the protection requirement may be satisfied, which often results in anonymised data with an unnecessarily or unacceptably low level of utility or protection. In this thesis, we study how to construct k-anonymous data that satisfies both data utility and privacy protection requirements. We propose new criteria to capture utility and protection requirements, and new algorithms that allow k-anonymisations with the required utility/protection trade-off or guarantees to be generated. Our extensive experiments using both benchmark and synthetic datasets show that our methods are efficient, can produce k-anonymised data with desired properties, and outperform the state-of-the-art methods in retaining data utility and providing privacy protection.
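The k-anonymity property itself is easy to state as a check over quasi-identifier groups. A minimal sketch follows; the attribute names and the single generalisation rule are invented for the example, not taken from the thesis.

```python
from collections import Counter

def is_k_anonymous(rows, quasi_ids, k):
    """True if every combination of quasi-identifier values occurs
    in at least k records."""
    groups = Counter(tuple(r[q] for q in quasi_ids) for r in rows)
    return all(count >= k for count in groups.values())

def generalise(row, width=10):
    """Coarsen age into a range; one simple generalisation step."""
    lo = (row["age"] // width) * width
    return {**row, "age": f"{lo}-{lo + width - 1}"}

rows = [{"age": 23, "zip": "440"}, {"age": 27, "zip": "440"},
        {"age": 42, "zip": "441"}, {"age": 45, "zip": "441"}]
print(is_k_anonymous(rows, ["age", "zip"], 2))                           # False
print(is_k_anonymous([generalise(r) for r in rows], ["age", "zip"], 2))  # True
```

Each generalisation step buys protection at the cost of utility, which is exactly the trade-off the thesis's criteria and algorithms aim to control.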
9

Sobati, Moghadam Somayeh. "Contributions to Data Privacy in Cloud Data Warehouses." Thesis, Lyon, 2017. http://www.theses.fr/2017LYSE2020.

Abstract:
Nowadays, data outsourcing scenarios are ever more common with the advent of cloud computing. Cloud computing appeals to businesses and organizations because of a wide variety of benefits, such as cost savings and service benefits. Moreover, cloud computing provides higher availability, scalability, and more effective disaster recovery than in-house operations. One of the most notable cloud outsourcing services is database outsourcing (Database-as-a-Service), where individuals and organizations outsource data storage and management to a Cloud Service Provider (CSP). Naturally, such services allow storing a data warehouse (DW) on a remote, untrusted CSP and running on-line analytical processing (OLAP). Although cloud data outsourcing brings many benefits, it also raises security and, in particular, privacy concerns. A typical solution to preserve data privacy is encrypting data locally before sending them to an external server. Secure database management systems use various encryption schemes, but they either induce computational and storage overhead or reveal some information about the data, which jeopardizes privacy. In this thesis, we propose a new secure secret splitting scheme (S4) inspired by Shamir's secret sharing. S4 implements an additive homomorphic scheme, i.e., additions can be directly computed over encrypted data. S4 addresses the shortcomings of existing approaches by reducing storage and computational overhead while still enforcing a reasonable level of privacy. S4 is efficient both in terms of storage and computing, which is ideal for data outsourcing scenarios where the user has limited computation and storage resources. Experimental results confirm the efficiency of S4 in terms of computation and storage overhead with respect to existing solutions. Moreover, we also present new order-preserving schemes, order-preserving indexing (OPI) and wrap-around order-preserving indexing (waOPI), which are practical for cloud-outsourced DWs. We focus on the problem of performing range and exact-match queries over encrypted data. In contrast to existing solutions, our schemes prevent an adversary from performing statistical and frequency analysis. While providing data privacy, the proposed schemes offer good performance and require minimal change to existing software.
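S4 itself is not detailed here, but the Shamir secret sharing it draws on, including the additive homomorphism exploited for aggregation, can be sketched. The Mersenne-prime field size is an illustrative choice, not a parameter from the thesis.

```python
import random

P = 2**61 - 1  # a Mersenne prime; illustrative field size

def share(secret, t, n):
    """Split secret into n points on a random degree-(t-1) polynomial;
    any t shares suffice to reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * -xj % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

# Additive homomorphism: servers add share values pointwise, so SUM
# aggregates can be computed without seeing either operand.
a, b = share(100, 2, 3), share(42, 2, 3)
summed = [(x, (ya + yb) % P) for (x, ya), (_, yb) in zip(a, b)]
print(reconstruct(summed[:2]))  # 142
```

Plain Shamir sharing multiplies storage by n, which is one of the overheads the S4 scheme is designed to reduce.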
10

Ma, Jianjie. "Learning from perturbed data for privacy-preserving data mining." Online access for everyone, 2006. http://www.dissertations.wsu.edu/Dissertations/Summer2006/j%5Fma%5F080406.pdf.

11

Swapna, B., and R. VijayaPrakash. "Privacy Preserving Data Mining Operations without Disrupting Data Quality." International Journal of Computer Science and Network (IJCSN), 2012. http://hdl.handle.net/10150/271473.

Abstract:
Data mining operations have become prevalent as they can extract trends or patterns that support good business decisions. They often operate on large historical databases or data warehouses to obtain actionable knowledge or business intelligence that helps in making well-informed decisions. Many tools have emerged in the data mining domain to perform these operations; they are essential for obtaining actionable knowledge from data, since doing so manually is infeasible given the volume of data and the time required. Thus the data mining domain is improving at a rapid pace. While data mining operations are very useful for obtaining business intelligence, they also have a drawback: they can extract sensitive information from the database, and people may misuse this freedom by obtaining sensitive information illegally. Preserving the privacy of data is therefore also important. Towards this end, many Privacy Preserving Data Mining (PPDM) algorithms have come into existence that sanitize data to prevent data mining algorithms from extracting sensitive information from databases.
Data mining operations help discover business intelligence from historical data. The extracted business intelligence or actionable knowledge helps in making well-informed decisions that lead to profit for the organization that uses it. While performing mining, the privacy of data has to be given the utmost importance. To achieve this, PPDM came into existence, sanitizing databases so as to prevent the discovery of association rules. However, this modifies the data and thus disrupts its quality. This paper proposes a new technique and algorithms that can perform privacy-preserving data mining operations while ensuring that data quality is not lost. The empirical results revealed that the proposed technique is useful and can be used in real-world applications.
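A common sanitization strategy in this line of work lowers the support of a sensitive itemset until standard mining no longer reports it. The following is a generic greedy heuristic sketch, not the paper's specific algorithm; all names are invented for the example.

```python
def support(transactions, itemset):
    """Fraction of transactions containing every item of itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def hide_itemset(transactions, sensitive, min_support):
    """Remove one item of the sensitive itemset from supporting
    transactions until its support drops below the mining threshold.
    Only supporting transactions are touched, limiting quality loss."""
    txs = [set(t) for t in transactions]
    victim = min(sensitive)  # deterministic choice of item to drop
    for t in txs:
        if support(txs, sensitive) < min_support:
            break
        if sensitive <= t:
            t.discard(victim)
    return txs

db = [{"bread", "milk"}, {"bread", "milk", "eggs"},
      {"milk"}, {"bread", "milk"}]
sanitized = hide_itemset(db, {"bread", "milk"}, min_support=0.5)
print(support(sanitized, {"bread", "milk"}))  # 0.25
```

The tension the paper addresses is visible even here: every removed item is a small distortion of the data, so a quality-aware technique must choose victims carefully.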
12

Heerde, Harold Johann Wilhelm van. "Privacy-aware data management by means of data degradation." Versailles-St Quentin en Yvelines, 2010. http://www.theses.fr/2010VERS0031.

Abstract:
Service providers collect more and more privacy-sensitive information, even though it is hard to protect this information against hackers, abuse of weak privacy policies, negligence, and malicious database administrators. In this thesis, we take the position that endless retention of privacy-sensitive information will inevitably lead to unauthorized data disclosure. Limiting the retention of privacy-sensitive information limits the amount of stored data and therefore the impact of such a disclosure. The first contribution of this thesis is a retention model based on progressive and irreversible degradation of sensitive data. Removing data from a database system is not a straightforward task; data degradation has an impact on the storage structure, indexing, transaction management, and logging mechanisms. To show the feasibility of data degradation, we provide several techniques to implement it, mainly a combination of keeping data sorted on degradation time and using encryption techniques where possible. The techniques are validated by a prototype implementation and a theoretical analysis.
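One practical way to make degradation irreversible, in the spirit of the combination of time-ordered storage and encryption described above, is to encrypt each retention bucket under its own key and discard the key when the bucket expires (crypto-shredding). This is a simplified sketch with an illustrative hash-based keystream, not the thesis's actual scheme or a production cipher.

```python
import hashlib
import secrets

class DegradingStore:
    """Records are encrypted under a per-bucket key and appended in
    degradation-time order; degrading a bucket discards its key, so the
    ciphertexts become unrecoverable without touching the storage."""

    def __init__(self):
        self.keys = {}   # bucket -> key (the only secret state)
        self.rows = []   # (bucket, nonce, ciphertext)

    def _keystream(self, seed, n):
        out, ctr = b"", 0
        while len(out) < n:
            out += hashlib.sha256(seed + ctr.to_bytes(8, "big")).digest()
            ctr += 1
        return out[:n]

    def put(self, bucket, data: bytes):
        key = self.keys.setdefault(bucket, secrets.token_bytes(32))
        nonce = secrets.token_bytes(8)  # fresh per record
        ks = self._keystream(key + nonce, len(data))
        self.rows.append((bucket, nonce, bytes(a ^ b for a, b in zip(data, ks))))

    def read(self, bucket):
        key = self.keys.get(bucket)
        if key is None:
            return None  # degraded: key gone, plaintext unrecoverable
        return [bytes(a ^ b for a, b in zip(ct, self._keystream(key + nc, len(ct))))
                for bk, nc, ct in self.rows if bk == bucket]

    def degrade(self, bucket):
        self.keys.pop(bucket, None)

store = DegradingStore()
store.put(1, b"alice was here")
print(store.read(1))   # [b'alice was here']
store.degrade(1)
print(store.read(1))   # None
```

Deleting one small key is far cheaper than scrubbing every copy of a record from pages, indexes, and logs, which is why encryption helps where physical deletion is hard.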
13

Bonatti, Piero A., Bert Bos, Stefan Decker, Garcia Javier David Fernandez, Sabrina Kirrane, Vassilios Peristeras, Axel Polleres, and Rigo Wenning. "Data Privacy Vocabularies and Controls: Semantic Web for Transparency and Privacy." CEUR Workshop Proceedings, 2018. http://epub.wu.ac.at/6490/1/SW4SG_2018.pdf.

Abstract:
Managing privacy and understanding the handling of personal data has turned into a fundamental right, at least for Europeans, since May 25th, with the coming into force of the General Data Protection Regulation (GDPR). Yet, whereas many different tools by different vendors promise companies to guarantee their compliance with GDPR in terms of consent management and keeping track of the personal data they handle in their processes, interoperability between such tools, as well as uniform user-facing interfaces, will be needed to enable true transparency, user-configurable and -manageable privacy policies, and data portability (as also, implicitly, promised by GDPR). We argue that such interoperability can be enabled by agreed-upon vocabularies and Linked Data.
14

Ataei, Mehrnaz. "Location data privacy : principles to practice." Doctoral thesis, Universitat Jaume I, 2018. http://hdl.handle.net/10803/666740.

Abstract:
A thesis submitted in partial fulfillment of the requirements for the degree of Doctor in Information Management, specialization in Geographic Information Systems
Location data is essential to the provision of relevant and tailored information in location-based services (LBS) but has the potential to reveal sensitive information about users. Unwanted disclosure of location data is associated with various threats known as dataveillance which can lead to risks like loss of control, (continuous) monitoring, identification, and social profiling. Striking a balance between providing a service based on the user’s location while protecting their (location) privacy is thus a key challenge in this area. Although many solutions have been developed to mitigate the data privacy-related threats, the aspects involving users (i.e. User Interfaces (UI)) and the way in which location data management can affects (location) data privacy have not received much attention in the literature. This thesis develops and evaluates approaches to facilitate the design and development of privacy-aware LBS. This work has explicitly focused on three areas: location data management in LBS, the design of UI for LBS, and compliance with (location) data privacy regulation. To address location data management, this thesis proposes modifications to LBS architectures and introduces the concept of temporal and spatial ephemerality as an alternative way to manage location privacy. The modifications include adding two components to the LBS architecture: one component dedicated to the management of decisions regarding collected location data such as applying restriction on the time that the service provider stores the data; and one component for adjusting location data privacy settings for the users of LBS. This thesis then develops a set of UI controls for fine-grained management of location privacy settings based on privacy theory (Westin), privacy by design principles and general UI design principles. 
Finally, this thesis brings forth a set of guidelines for the design and development of privacy-aware LBS through the analysis of the General Data Protection Regulation (GDPR) and expert recommendations. Service providers, designers, and developers of LBS can benefit from the contributions of this work as the proposed architecture and UI model can help them to recognise and address privacy issues during the LBS development process. The developed guidelines, on the other hand, can be helpful when developers and designers face difficulties understanding (location) data privacy-related regulations. The guidelines include both a list of legal requirements derived from GDPR’s text and expert suggestions for developers and designers of LBS in the process of complying with data privacy regulation.
APA, Harvard, Vancouver, ISO, and other styles
15

Sivakumar, Anusha. "Enhancing Privacy Of Data Through Anonymization." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-177349.

Full text
Abstract:
A steep rise in the availability of personal data has resulted in endless opportunities for data scientists who utilize this open data for research. However, such easy availability of complex personal data challenges the privacy of the individuals represented in the data. To protect privacy, traditional methods such as using pseudonyms or blurring the identity of individuals are applied before releasing data. These traditional methods alone are not sufficient, because combining released data with other publicly available data or background knowledge can identify individuals. A potential solution to this privacy loss problem is to anonymize data so that it cannot be linked to the individuals represented in it. In the case of research involving personal data, anonymization becomes more important than ever. Yet if we alter data to preserve the privacy of research participants, the resulting data becomes almost useless for many research purposes. Therefore, preserving the privacy of individuals represented in the data while minimizing the data loss caused by privacy preservation is vital. In this project, we first study the different cases in which attacks take place, the different forms of attacks, and existing solutions to prevent them. After carefully examining the literature and the problem at hand, we propose a solution that preserves the privacy of research participants as much as possible while keeping the data useful to researchers. To support our solution, we consider the case of Digital Footprints, which collects and publishes Facebook data with the consent of the users.
A sharp increase in the availability of personal data has led to endless opportunities for data scientists to exploit these data for research. One consequence is that it becomes difficult to preserve individuals' privacy because of the enormous amount of information available. To protect personal privacy, traditional methods such as pseudonyms and aliases can be used before a person publishes personal data. Using these traditional methods alone is not enough to protect privacy; there are always ways to link the data to real individuals. A potential solution to this problem is to use anonymization techniques to modify data about the individual in a controlled way, thereby making it harder to link the data to a specific person. In studies involving personal data, anonymization becomes increasingly important. If we try to change data to preserve the privacy of research participants before the data is published, the resulting data becomes almost unusable for many studies. Preserving the privacy of the individuals represented in the data while minimizing the data loss caused by privacy preservation is therefore very important. In this thesis, we have studied the different cases in which attacks can occur, the different forms of attacks, and existing solutions to prevent them. After carefully reviewing the literature and the problem, we propose a theoretical solution to preserve the privacy of the research participants as much as possible while keeping the data useful for research. As support for our solution, we consider Digital Footprints, which stores Facebook data with the users' consent and releases the stored information through different user interfaces.
APA, Harvard, Vancouver, ISO, and other styles
16

Sang, Lin. "Social Big Data and Privacy Awareness." Thesis, Uppsala universitet, Institutionen för informatik och media, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-242444.

Full text
Abstract:
With the rapid development of Big Data, data from online social networks has become a major part of it. Big Data has made social networks data-oriented rather than social-oriented. Taking this into account, this dissertation presents a qualitative study of how the data-oriented social network affects its users' privacy management today. An overview of Big Data and privacy issues on social networks is presented as a background study. We adapted communication privacy theory as a framework for analysing how individuals manage their privacy on social networks, which we study as a whole. We selected Facebook as a case study to present the connections between social networks, Big Data, and privacy issues. The data supporting the results of this dissertation were collected through face-to-face, in-depth interviews. We found that people divided social networks into different levels of openness, according to their privacy concerns, in order to avoid privacy invasions and violations. They reduced their sharing on open social networks and transferred it to more closed ones. However, the risk of privacy problems actually rose because people neglected to understand how data is processed on social networks. They focused on managing their everyday sharing but too easily allowed other applications to access their personal data on the social network (such as the Facebook profile).
APA, Harvard, Vancouver, ISO, and other styles
17

Lazarovich, Amir. "Invisible Ink : blockchain for data privacy." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/98626.

Full text
Abstract:
Thesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2015.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 81-85).
The problem of maintaining complete control over and transparency with regard to our digital identity is growing more urgent as our lives become more dependent on online and digital services. What once was rightfully ours and under our control is now spread among uncountable entities across many locations. We have built a platform that securely distributes encrypted user-sensitive data. It uses the Bitcoin blockchain to keep a trust-less audit trail for data interactions and to manage access to user data. Our platform offers advantages to both users and service providers. The user enjoys the heightened transparency, control, and security of their personal data, while the service provider becomes much less vulnerable to single points of failure and breaches, which in turn decreases their exposure to information-security liability, thereby saving them money and protecting their brand. Our work extends an idea developed by the author and two collaborators: a peer-to-peer network that uses blockchain technology and off-blockchain storage to securely distribute sensitive data in a decentralized manner using a custom blockchain protocol. Our two main contributions are: 1. developing this platform and 2. analyzing its feasibility in real-world applications. This includes designing a protocol for data authentication that runs on an Internet-scale peer-to-peer network, abstracting complex interactions with encrypted data, building a dashboard for data auditing and management, as well as building servers and sample services that use this platform for testing and evaluation. This work has been supported by the MIT Communication Futures Program and the Digital Life Consortium.
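The tamper-evident audit trail described in this abstract can be illustrated with a minimal hash chain, independent of Bitcoin itself. The sketch below is not the dissertation's actual protocol; the field names and events are invented for the example. Each logged data interaction is linked to the hash of the previous entry, so altering any past record invalidates every later link:

```python
import hashlib
import json

def append_entry(chain, action, actor):
    """Append an access event, linking it to the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    entry = {"action": action, "actor": actor, "prev": prev_hash}
    payload = json.dumps(entry, sort_keys=True).encode()
    entry["hash"] = hashlib.sha256(payload).hexdigest()
    chain.append(entry)
    return chain

def verify_chain(chain):
    """Recompute every hash; any tampering breaks the links."""
    prev_hash = "0" * 64
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev"] != prev_hash:
            return False
        payload = json.dumps(body, sort_keys=True).encode()
        if hashlib.sha256(payload).hexdigest() != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_entry(log, "read:profile", "service-A")
append_entry(log, "write:photo", "service-B")
print(verify_chain(log))   # True
log[0]["actor"] = "mallory"  # tamper with history
print(verify_chain(log))   # False
```

A real blockchain adds distributed consensus on top of this linking, which is what removes the need to trust any single log keeper.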
by Amir Lazarovich.
S.M.
APA, Harvard, Vancouver, ISO, and other styles
18

DeYoung, Mark E. "Privacy Preserving Network Security Data Analytics." Diss., Virginia Tech, 2018. http://hdl.handle.net/10919/82909.

Full text
Abstract:
The problem of revealing accurate statistics about a population while maintaining privacy of individuals is extensively studied in several related disciplines. Statisticians, information security experts, and computational theory researchers, to name a few, have produced extensive bodies of work regarding privacy preservation. Still the need to improve our ability to control the dissemination of potentially private information is driven home by an incessant rhythm of data breaches, data leaks, and privacy exposure. History has shown that both public and private sector organizations are not immune to loss of control over data due to lax handling, incidental leakage, or adversarial breaches. Prudent organizations should consider the sensitive nature of network security data and network operations performance data recorded as logged events. These logged events often contain data elements that are directly correlated with sensitive information about people and their activities -- often at the same level of detail as sensor data. Privacy preserving data publication has the potential to support reproducibility and exploration of new analytic techniques for network security. Providing sanitized data sets de-couples privacy protection efforts from analytic research. De-coupling privacy protections from analytical capabilities enables specialists to tease out the information and knowledge hidden in high dimensional data, while, at the same time, providing some degree of assurance that people's private information is not exposed unnecessarily. In this research we propose methods that support a risk based approach to privacy preserving data publication for network security data. Our main research objective is the design and implementation of technical methods to support the appropriate release of network security data so it can be utilized to develop new analytic methods in an ethical manner. 
Our intent is to produce a database which holds network security data representative of a contextualized network and people's interaction with the network mid-points and end-points without the problems of identifiability.
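One common building block for sanitizing logged network events of this kind is keyed pseudonymization of identifying fields: a keyed hash keeps records linkable for analysis while removing direct identifiers. This is an illustrative sketch only, not this dissertation's method; the key, field names, and truncation length are invented:

```python
import hashlib
import hmac

SECRET_KEY = b"rotate-me-per-release"  # hypothetical sanitization key

def pseudonymize(value, key=SECRET_KEY):
    """Replace an identifier with a keyed hash: consistent within one
    data release, but not reversible without the key."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]

event = {"src_ip": "192.0.2.10", "dst_port": 443, "bytes": 5120}
sanitized = {**event, "src_ip": pseudonymize(event["src_ip"])}
print(sanitized)  # src_ip replaced by a stable 16-hex-digit token
```

Rotating the key between releases limits cross-release linkage, one of the knobs in a risk-based approach to publication.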
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
19

Shang, Hui. "Privacy Preserving Kin Genomic Data Publishing." Miami University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=miami1594835227299524.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Lin, Zehua. "Privacy Preserving Social Network Data Publishing." Miami University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=miami1610045108271476.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Gonçalves, João Miguel Ribeiro. "Context-awareness privacy in data communications." Doctoral thesis, Universidade de Aveiro, 2015. http://hdl.handle.net/10773/15760.

Full text
Abstract:
Doctorate in Informatics
Internet users consume online targeted advertising based on information collected about them and voluntarily share personal information in social networks. Sensor information and data from smartphones is collected and used by applications, sometimes in unclear ways. As happens today with smartphones, in the near future sensors will be shipped in all types of connected devices, enabling ubiquitous information gathering from the physical environment and realizing the vision of Ambient Intelligence. The value of gathered data, if not obvious, can be harnessed through data mining techniques and put to use by enabling personalized and tailored services as well as business intelligence practices, fueling the digital economy. However, the ever-expanding information gathering and use undermines the privacy conceptions of the past. Natural social practices of managing privacy in daily relations are overridden by socially-awkward communication tools, service providers struggle with security issues resulting in harmful data leaks, governments use mass surveillance techniques, the incentives of the digital economy threaten consumer privacy, and the advancement of consumer-grade data-gathering technology enables new inter-personal abuses. A wide range of fields attempts to address technology-related privacy problems; however, they vary immensely in terms of assumptions, scope and approach. Privacy of future use cases is typically handled vertically, instead of building upon previous work that can be re-contextualized, while current privacy problems are typically addressed per type in a more focused way. Because significant effort was required to make sense of the relations and structure of privacy-related work, this thesis attempts to transmit a structured view of it. It is multi-disciplinary - from cryptography to economics, including distributed systems and information theory - and addresses privacy issues of different natures.
As existing work is framed and discussed, the contributions to the state of the art made in the scope of this thesis are presented. The contributions add to five distinct areas: 1) identity in distributed systems; 2) future context-aware services; 3) event-based context management; 4) low-latency information flow control; 5) high-dimensional dataset anonymity. Finally, having laid out such a landscape of privacy-preserving work, the current and future privacy challenges are discussed, considering not only technical but also socio-economic perspectives.
Internet users see advertising targeted at their browsing habits and probably share personal information voluntarily on social networks. The information available on today's mobile phones is widely accessed and used by mobile applications, sometimes without clear reasons for it. Just as happens with mobile phones today, in the future many types of electronic devices will include sensors that capture data from the environment, enabling the emergence of intelligent environments. The value of the captured data, if not obvious, can be derived through data analysis techniques and used to provide personalized services and to define business strategies, fostering the digital economy. However, these information-gathering practices create new privacy issues. Natural practices of interpersonal relations are hindered by new means of communication that do not account for them; information security problems keep occurring; states monitor their citizens; the digital economy leads to the monitoring of consumers; and the capture and recording capabilities of new electronic devices can be used abusively by users themselves against other people. A large number of scientific fields address technology-related privacy problems, but they do so in different ways and from distinct starting points. The privacy of new scenarios is typically treated vertically, rather than by re-contextualizing existing work, while current problems are treated in a more focused manner. Owing to this fragmentation of existing work, structuring it was a highly relevant exercise within this thesis. The work identified is multi-disciplinary - from cryptography to economics, including distributed systems and information theory - and deals with privacy problems of different natures.
As the existing work is presented, the contributions made by this thesis are discussed. They fall into five distinct areas: 1) identity in distributed systems; 2) context-aware services; 3) event-oriented management of context information; 4) low-latency information flow control; 5) anonymous recommendation databases. Having described the existing work on privacy, current and future privacy challenges are discussed, considering socio-economic as well as technical perspectives.
APA, Harvard, Vancouver, ISO, and other styles
22

Thomas, Dilys. "Algorithms and architectures for data privacy /." May be available electronically:, 2007. http://proquest.umi.com/login?COPT=REJTPTU1MTUmSU5UPTAmVkVSPTI=&clientId=12498.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Smith, Tanshanika Turner. "Examining Data Privacy Breaches in Healthcare." ScholarWorks, 2016. https://scholarworks.waldenu.edu/dissertations/2623.

Full text
Abstract:
Healthcare data can contain sensitive, personal, and confidential information that should remain secure. Despite the efforts to protect patient data, security breaches occur and may result in fraud, identity theft, and other damages. Grounded in the theoretical backdrop of integrated system theory, the purpose of this study was to determine the association between data privacy breaches, data storage locations, business associates, covered entities, and number of individuals affected. Study data consisted of secondary breach information retrieved from the Department of Health and Human Services Office of Civil Rights. Loglinear analytical procedures were used to examine U.S. healthcare breach incidents and to derive a 4-way loglinear model. Loglinear analysis procedures included in the model yielded a significance value of 0.000 (p < .05) for both the likelihood ratio and Pearson chi-square statistics, indicating that an association among the variables existed. Results showed that over 70% of breaches involve healthcare providers and revealed that security incidents often consist of electronic or other digital information. Findings revealed that threats are evolving and showed that likely factors other than data loss and theft contribute to security events, unwanted exposure, and breach incidents. Research results may impact social change by providing security professionals with a broader understanding of data breaches required to design and implement more secure and effective information security prevention programs. Healthcare leaders might affect social change by utilizing findings to further the security dialogue needed to minimize security risk factors, protect sensitive healthcare data, and reduce breach mitigation and incident response costs.
APA, Harvard, Vancouver, ISO, and other styles
24

de, Souza Tulio. "Data-level privacy through data perturbation in distributed multi-application environments." Thesis, University of Oxford, 2016. https://ora.ox.ac.uk/objects/uuid:2b818039-bde4-41d6-96ca-0367704a53f0.

Full text
Abstract:
Wireless sensor networks used to have a main role as a monitoring tool for environmental purposes and animal tracking. This spectrum of applications, however, has grown dramatically in the past few years. Such evolution means that what used to be application-specific networks are now multi-application environments, often with federation capabilities. This shift results in a challenging environment for data privacy, mainly caused by the broadening of the spectrum of data access points and involved entities. This thesis first evaluates existing privacy-preserving data aggregation techniques to determine how suitable they are for providing data privacy in this more elaborate environment. This evaluation led to the design of the set difference attack, which exploits the fact that they all rely purely on data aggregation to achieve privacy; simulation shows that aggregation alone is not suitable for the task. It also indicates that some form of uncertainty is required in order to mitigate the attack. Another relevant finding is that the attack can also be effective against standalone networks, by exploiting the node availability factor. Uncertainty is achieved via the use of differential privacy, which offers a strong and formal privacy guarantee through data perturbation. In order to make it suitable for a wireless sensor network environment, which mainly deals with time-series data, two new approaches have been proposed. These have contrasting effects on utility and privacy levels, offering a flexible balance between privacy and data utility for sensed entities and data analysts/consumers. Lastly, this thesis proposes a framework to assist in the design of privacy-preserving data aggregation protocols that suit application needs while at the same time complying with desired privacy requirements.
The framework's evaluation compares and contrasts several scenarios to demonstrate the level of flexibility and effectiveness that the designed protocols can provide. Overall, this thesis demonstrates that data perturbation can be made significantly practical through the proposed framework. Although some problems remain, with further improvements to data correlation methods and better use of some intrinsic characteristics of such networks, the use of data perturbation may become a practical and efficient privacy preserving mechanism for wireless sensor networks.
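The data-perturbation idea referenced here is usually realized with the Laplace mechanism of differential privacy: noise with scale sensitivity/epsilon is added to each released aggregate. A minimal sketch under assumed parameters (the readings, epsilon, and sensitivity values are invented for illustration, and this is not the thesis's specific time-series construction):

```python
import random

def laplace_noise(scale):
    # Laplace(0, b) equals the difference of two Exponential(1/b) draws
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def private_sum(readings, epsilon, sensitivity):
    """Release a sum perturbed with noise of scale sensitivity/epsilon,
    the standard epsilon-differentially-private Laplace mechanism."""
    return sum(readings) + laplace_noise(sensitivity / epsilon)

readings = [21.3, 21.7, 22.1, 21.9]  # one epoch of sensor values
noisy = private_sum(readings, epsilon=0.5, sensitivity=1.0)
print(f"true sum {sum(readings):.1f}, released {noisy:.1f}")
```

Smaller epsilon means larger noise, which is exactly the privacy/utility trade-off the abstract describes for sensed entities versus data consumers.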
APA, Harvard, Vancouver, ISO, and other styles
25

Zheng, Yao. "Privacy Preservation for Cloud-Based Data Sharing and Data Analytics." Diss., Virginia Tech, 2016. http://hdl.handle.net/10919/73796.

Full text
Abstract:
Data privacy is a globally recognized human right for individuals to control the access to their personal information, and bar the negative consequences from the use of this information. As communication technologies progress, the means to protect data privacy must also evolve to address new challenges as they come into view. Our research goal in this dissertation is to develop privacy protection frameworks and techniques suitable for the emerging cloud-based data services, in particular privacy-preserving algorithms and protocols for the cloud-based data sharing and data analytics services. Cloud computing has enabled users to store, process, and communicate their personal information through third-party services. It has also raised privacy issues regarding losing control over data, mass harvesting of information, and un-consented disclosure of personal content. Above all, the main concern is the lack of understanding about data privacy in cloud environments. Currently, the cloud service providers either advocate the principle of third-party doctrine and deny users' rights to protect their data stored in the cloud; or rely on the notice-and-choice framework and present users with ambiguous, incomprehensible privacy statements without any meaningful privacy guarantee. In this regard, our research has three main contributions. First, to capture users' privacy expectations in cloud environments, we conceptually divide personal data into two categories, i.e., visible data and invisible data. The visible data refer to information users intentionally create, upload to, and share through the cloud; the invisible data refer to users' information retained in the cloud that is aggregated, analyzed, and repurposed without their knowledge or understanding. Second, to address users' privacy concerns raised by cloud computing, we propose two privacy protection frameworks, namely individual control and use limitation.
The individual control framework emphasizes users' capability to govern the access to the visible data stored in the cloud. The use limitation framework emphasizes users' expectation to remain anonymous when the invisible data are aggregated and analyzed by cloud-based data services. Finally, we investigate various techniques to accommodate the new privacy protection frameworks, in the context of four cloud-based data services: personal health record sharing, location-based proximity test, link recommendation for social networks, and face tagging in photo management applications. For the first case, we develop a key-based protection technique to enforce fine-grained access control to users' digital health records. For the second case, we develop a key-less protection technique to achieve location-specific user selection. For the latter two cases, we develop distributed learning algorithms to prevent large-scale data harvesting. We further combine these algorithms with query regulation techniques to achieve user anonymity. The picture that is emerging from the above works is a bleak one. Regarding personal data, the reality is that we can no longer control it all. As communication technologies evolve, the scope of personal data has expanded beyond local, discrete silos and become integrated into the Internet. The traditional understanding of privacy must be updated to reflect these changes. In addition, because privacy is a particularly nuanced problem that is governed by context, there is no one-size-fits-all solution. While some cases can be salvaged either by cryptography or by other means, in others a rethinking of the trade-offs between utility and privacy appears to be necessary.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
26

Liyanaarachchi, Gajendra P. "Data Privacy Considerations for Redesigning Organizational Strategy." Thesis, Griffith University, 2022. http://hdl.handle.net/10072/417671.

Full text
Abstract:
The thesis consists of eight interrelated chapters. The introduction chapter provides an overview of the thesis topic, summarizes the gaps in the existing knowledge, and describes the research's purpose, context, and design. The second chapter includes paper 1, which identifies the rationale and antecedents for the personal privacy concerns that lead to the privacy paradox. The third chapter comprises paper 2, which uses personal privacy concerns to segment customers and design sales strategies in online banking. The fourth chapter includes paper 3, which draws on personal privacy concerns and organizational experience with data privacy to design a business strategy. The fifth chapter provides paper 4, which integrates consumer insight on past, present, and anticipated future to identify different privacy paradox situations. These situations provide the foundation for creating a strategic competitive advantage, mainly through access to individual privacy concerns. The sixth chapter includes paper 5, which emphasizes identifying consumer vulnerability due to privacy risk as a CSR initiative focusing on a broader ecosystem beyond commercial interest. The seventh chapter includes paper 6, a systematic literature review that provides a comprehensive definition and determines the future direction of online privacy. The purpose is to balance consumer vulnerability and organizational strategy to create a platform driven by a mutual interest in privacy protection and growth. The eighth chapter discusses the contribution and impact of this research on the privacy paradox literature. Figure 1 demonstrates the outline of the thesis and the integration of the publications of the study.
Thesis (PhD Doctorate)
Doctor of Philosophy (PhD)
Dept of Marketing
Griffith Business School
Full Text
APA, Harvard, Vancouver, ISO, and other styles
27

Fernandez, Garcia Javier D., Fajar J. Ekaputra, Peb Ruswono Aryan, Amr Azzam, and Elmar Kiesling. "Privacy-aware Linked Widgets." ACM Press, 2019. http://epub.wu.ac.at/6859/1/Privacy_aware_Linked_Data_Widgets___WWW_19__Camera_Ready.pdf.

Full text
Abstract:
The European General Data Protection Regulation (GDPR) brings new challenges for companies, who must demonstrate that their systems and business processes comply with usage constraints specified by data subjects. However, due to the lack of standards, tools, and best practices, many organizations struggle to adapt their infrastructure and processes to ensure and demonstrate that all data processing is in compliance with users' given consent. The SPECIAL EU H2020 project has developed vocabularies that can formally describe data subjects' given consent, as well as methods that use this description to automatically determine whether processing of the data according to a given policy is compliant with that consent. Whereas this makes it possible to determine whether processing was compliant or not, integration of the approach into existing line-of-business applications and ex-ante compliance checking remain open challenges. In this short paper, we demonstrate how the SPECIAL consent and compliance framework can be integrated into Linked Widgets, a mashup platform, in order to support privacy-aware ad-hoc integration of personal data. The resulting environment makes it possible to create data integration and processing workflows out of components that inherently respect the usage policies of the data being processed and are able to demonstrate compliance. We provide an overview of the necessary metadata and orchestration towards a privacy-aware linked data mashup platform that automatically respects subjects' given consent. The evaluation results show the potential of our approach for ex-ante usage policy compliance checking within the Linked Widgets platform and beyond.
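The ex-ante compliance check described here can be illustrated, in a much-simplified form, as subsumption between a processing policy and the data subject's consent. This toy flattens SPECIAL's vocabulary-based reasoning into plain dictionaries; the dimension and value names are invented and are not the project's actual ontology:

```python
def complies(policy, consent):
    """A processing policy complies if, on every dimension, its value
    falls within what the data subject consented to (set subsumption;
    a simplified stand-in for SPECIAL's vocabulary reasoning)."""
    return all(policy[dim] in consent.get(dim, set()) for dim in policy)

consent = {
    "purpose": {"research", "service-improvement"},
    "processing": {"aggregate", "analyse"},
    "storage": {"eu"},
}
analytics = {"purpose": "research", "processing": "aggregate", "storage": "eu"}
marketing = {"purpose": "marketing", "processing": "analyse", "storage": "eu"}
print(complies(analytics, consent))  # True
print(complies(marketing, consent))  # False
```

In the full framework, values form class hierarchies rather than flat sets, so subsumption is decided by an OWL reasoner instead of set membership.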
APA, Harvard, Vancouver, ISO, and other styles
28

Joines, Amy. "Impact of private data mining on personal privacy from agents of government." [Ames, Iowa : Iowa State University], 2009.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
29

Stouppa, Phiniki. "Deciding Data Privacy for ALC Knowledge Bases /." [S.l.] : [s.n.], 2009. http://www.ub.unibe.ch/content/bibliotheken_sammlungen/sondersammlungen/dissen_bestellformular/index_ger.html.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Balla, Stefano. "Privacy-Preserving Data Mining: un approccio verticale." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amslaurea.unibo.it/17517/.

Full text
Abstract:
The growing availability of data, while bringing great benefits, has exposed several risks arising from the disclosure of confidential information. Moreover, the information-extraction method known as data mining poses a further problem by making it possible to extract sensitive information. This thesis addresses privacy at three main levels: privacy-preserving data providing, privacy-preserving data collecting, and privacy-preserving data mining. These levels divide the data life cycle into three stages, in which data is first supplied, then stored, and finally subjected to data mining. For each level, technologies, techniques, and approaches to preserving data privacy are proposed, with the aim of minimizing information loss.
APA, Harvard, Vancouver, ISO, and other styles
31

Chen, Xiaoqiang. "Privacy Preserving Data Publishing for Recommender System." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-155785.

Full text
Abstract:
Driven by mutual benefits, the exchange and publication of data among various parties is an inevitable trend. However, released data often contains sensitive information, so direct publication violates individual privacy. This undertaking is in the scope of privacy-preserving data publishing (PPDP). Among many privacy models, the k-anonymity framework is popular and well studied; it protects data by constructing groups of anonymous records such that each record in the released table is covered by no fewer than k-1 other records. This thesis investigates different privacy models and focuses on achieving k-anonymity for large-scale and sparse databases, especially recommender systems. We present a general process for the anonymization of large-scale databases. A preprocessing phase strategically extracts a preference matrix from the original data by Singular Value Decomposition (SVD), eliminating the high-dimensionality and sparsity problems. A new clustering-based k-anonymity heuristic named Bisecting K-Gather (BKG) is introduced and shown to be efficient and accurate. To support customized user privacy assignments, we also propose a new concept called customized k-anonymity along with a corresponding algorithm. Experiments on the MovieLens database are presented. The results show that we can release anonymized data with little compromise of privacy.
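The k-anonymity property defined in this abstract is easy to state as a check over a released table: every combination of quasi-identifier values must be shared by at least k records. A minimal sketch with hypothetical generalized records (not the thesis's BKG heuristic, which additionally chooses how to form the groups):

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Each combination of quasi-identifier values must occur in at
    least k records of the released table."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

released = [
    {"age": "30-39", "zip": "123**", "rating": 5},
    {"age": "30-39", "zip": "123**", "rating": 3},
    {"age": "40-49", "zip": "456**", "rating": 4},
    {"age": "40-49", "zip": "456**", "rating": 1},
]
print(is_k_anonymous(released, ["age", "zip"], k=2))  # True
print(is_k_anonymous(released, ["age", "zip"], k=3))  # False
```

Anonymization algorithms such as BKG search for the generalization (here, the bucketed ages and masked zip codes) that satisfies this check with the least information loss.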
APA, Harvard, Vancouver, ISO, and other styles
32

Scheffler, Thomas. "Privacy enforcement with data owner-defined policies." Phd thesis, Universität Potsdam, 2013. http://opus.kobv.de/ubp/volltexte/2013/6793/.

Full text
Abstract:
This thesis proposes a privacy protection framework for the controlled distribution and use of personal private data. The framework is based on the idea that privacy policies can be set directly by the data owner and can be automatically enforced against the data user. Data privacy continues to be a very important topic, as our dependency on electronic communication maintains its current growth, and private data is shared between multiple devices, users and locations. The growing amount and the ubiquitous availability of personal private data increase the likelihood of data misuse. Early privacy protection techniques, such as anonymous email and payment systems, focused on data avoidance and anonymous use of services. They did not take into account that data sharing cannot be avoided when people participate in electronic communication scenarios that involve social interactions. This leads to a situation where data is shared widely and uncontrollably, and in most cases the data owner has no control over the further distribution and use of personal private data. Previous efforts to integrate privacy awareness into data processing workflows have focused on extending existing access control frameworks with privacy-aware functions or have analysed specific individual problems such as the expressiveness of policy languages. So far, very few implementations of integrated privacy protection mechanisms exist that can be studied to prove their effectiveness for privacy protection. Second-level issues that stem from practical application of the implemented mechanisms, such as usability, life-time data management and changes in trustworthiness, have received very little attention so far, mainly because they require actual implementations to be studied. Most existing privacy protection schemes silently assume that it is the privilege of the data user to define the contract under which personal private data is released.
Such an approach simplifies policy management and policy enforcement for the data user, but leaves the data owner with a binary decision to submit or withhold his or her personal data based on the provided policy. We wanted to empower the data owner to express his or her privacy preferences through privacy policies that follow the so-called Owner-Retained Access Control (ORAC) model. ORAC was proposed by McCollum et al. as an alternative access control mechanism that leaves the authority over access decisions with the originator of the data. The data owner is given control over the release policy for his or her personal data, and he or she can set permissions or restrictions according to individually perceived trust values. Such a policy needs to be expressed in a coherent way and must allow deterministic policy evaluation by different entities. The privacy policy also needs to be communicated from the data owner to the data user, so that it can be enforced. Data and policy are stored together as a Protected Data Object that follows the Sticky Policy paradigm as defined by Mont et al. and others. We developed a unique policy combination approach that takes usability aspects for the creation and maintenance of policies into consideration. Our privacy policy consists of three parts: a Default Policy provides basic privacy protection if no specific rules have been entered by the data owner; an Owner Policy allows the customisation of the default policy by the data owner; and a so-called Safety Policy guarantees that the data owner cannot specify disadvantageous policies which, for example, exclude him or her from further access to the private data. The combined evaluation of these three policy parts yields the necessary access decision. The automatic enforcement of privacy policies in our protection framework is supported by a reference monitor implementation.
We started our work with the development of a client-side protection mechanism that allows the enforcement of data-use restrictions after private data has been released to the data user. The client-side enforcement component for data-use policies is based on a modified Java Security Framework. Privacy policies are translated into corresponding Java permissions that can be automatically enforced by the Java Security Manager. When we later extended our work to implement server-side protection mechanisms, we found several drawbacks for the privacy enforcement through the Java Security Framework. We solved this problem by extending our reference monitor design to use Aspect-Oriented Programming (AOP) and the Java Reflection API to intercept data accesses in existing applications and provide a way to enforce data owner-defined privacy policies for business applications.
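The three-part policy combination described in the abstract can be sketched as a priority-ordered evaluation, where the Owner Policy overrides the Default Policy and the Safety Policy overrides both (rule names and format here are hypothetical illustrations, not the thesis's actual policy language):

```python
def evaluate(action, default_policy, owner_policy, safety_policy):
    """Return an access decision by consulting policies in increasing priority.
    Each policy maps an action to 'permit'/'deny'; a higher-priority policy
    overrides a lower one, and the Safety Policy always wins."""
    decision = default_policy.get(action, "deny")   # lowest priority: baseline protection
    decision = owner_policy.get(action, decision)   # owner customisation
    decision = safety_policy.get(action, decision)  # highest priority: guarantees
    return decision

default_policy = {"read": "permit", "forward": "deny"}
owner_policy   = {"forward": "permit", "owner_read": "deny"}  # owner mistakenly locks self out
safety_policy  = {"owner_read": "permit"}                     # safety rule restores owner access

print(evaluate("forward", default_policy, owner_policy, safety_policy))    # permit
print(evaluate("owner_read", default_policy, owner_policy, safety_policy)) # permit
```

The `owner_read` example mirrors the Safety Policy's purpose: even a disadvantageous owner rule cannot exclude the owner from his or her own data.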
As part of this dissertation, a framework for the enforcement of policies protecting private data was created, based on the idea that these policies are defined directly by the data owners and can be enforced automatically. The protection of private data is a very important topic in electronic communication, and it gains further importance through increasing device interconnection and the availability and use of private data in online services. In the past, various techniques for protecting private data have been developed: so-called Privacy Enhancing Technologies. Many of these technologies work on the principle of data minimisation and anonymisation and thus conflict with modern network use in social media. This leads to a situation in which private data is widely distributed and used without the data owner being able to exercise targeted control over the distribution and use of his or her private data. Existing policy-based data protection techniques generally assume that the data user, not the data owner, defines the policies for handling private data. This approach simplifies the management and enforcement of access restrictions for the data user, but leaves the data owner only the alternative of accepting the data user's policies or not sharing any data. Our approach was therefore to strengthen the interests of the data owner by enabling him or her to formulate own policies. The access control model used is known as Owner-Retained Access Control (ORAC) and was formulated by McCollum et al. in 1990. The core principle of this model is that the authority over access decisions always remains with the originator of the data. Two challenges arise from this approach.
First, the data owner must be enabled to formulate meaningful and correct policies for the handling of his or her data. Since data owners are ordinary computer users, it must be assumed that they will make mistakes when creating policies. We solved this problem by splitting the privacy policies into three separate parts with different priorities. The part with the lowest priority defines basic protection properties. The data owner can override these properties with own rules of medium priority. In addition, a part containing high-priority safety rules ensures that certain access rights are always preserved. The second challenge is the targeted communication of the policies and their enforcement against the data user. To make the policies known to the data user, we use so-called Sticky Policies: the policies are attached to the data to be protected using a suitable encoding, so that they can be referenced at any time and the owner's privacy requirements are preserved when the data is distributed. For enforcing the policies on the data user's system, we developed two different approaches. We developed a so-called reference monitor, which controls every access to the private data and decides, based on the rules stored in the Sticky Policy, whether the data user is granted access to the data or not. This reference monitor was implemented, on the one hand, as a client-side solution built on the security architecture of the Java programming language.
On the other hand, a server-side solution was developed that can control access to particular methods of a program with the help of Aspect-Oriented Programming. In the client-side reference monitor, privacy policies are translated into Java Permissions and automatically enforced against arbitrary applications by the Java Security Manager. Since this approach requires restarting the application whenever data with a different privacy policy is accessed, a different approach was chosen for the server-side reference monitor. Using the Java Reflection API and methods of Aspect-Oriented Programming, data accesses in existing applications could be intercepted and permitted or denied only after the privacy policy had been checked. Both solutions were tested for their performance and represent an extension of previously known techniques for the protection of private data.
APA, Harvard, Vancouver, ISO, and other styles
33

Sweatt, Brian M. (Brian Michael). "A privacy-preserving personal sensor data ecosystem." Thesis, Massachusetts Institute of Technology, 2014. http://hdl.handle.net/1721.1/91875.

Full text
Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 79-82).
Despite the ubiquity of passively-collected sensor data (primarily attained via smartphones), there does not currently exist a comprehensive system for authorizing the collection of such data, collecting, storing, analyzing, and visualizing it in a manner that preserves the privacy of the user generating the data. This thesis shows the design and implementation of such a system, named openPDS, from both the client and server perspectives. Two server-side components are implemented: a centralized registry server for authentication and authorization of all entities in the system, and a distributed Personal Data Store that allows analysis to be run against the stored sensor data and aggregated across multiple Personal Data Stores in a privacy-preserving fashion. The client, implemented for the Android mobile phone operating system, makes use of the Funf Open Sensing framework to collect data and adds the ability for users to authenticate against the registry server, authorize third-party applications to analyze data once it reaches their Personal Data Store, and finally, visualize the result of such analysis within a mobile phone or web browser. A number of example quantified-self and social applications are built on top of this framework to demonstrate feasibility of the system from both development and user perspectives.
by Brian M. Sweatt.
M. Eng.
APA, Harvard, Vancouver, ISO, and other styles
34

Paradesi, Sharon M. (Sharon Myrtle) 1986. "User-controlled privacy for personal mobile data." Thesis, Massachusetts Institute of Technology, 2014. http://hdl.handle.net/1721.1/93839.

Full text
Abstract:
Thesis: Elec. E. in Computer Science, Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 81-82).
Smartphones collect a wide range of sensor data, ranging from the basic, such as location, accelerometer, and Bluetooth, to the more advanced, such as heart rate. Mobile apps on the Android and iOS platforms provide users with "all-or-nothing" controls during installation to get permission for data collection and use. Users have to either agree to have the app collect and use all the requested data or not use the app at all. This is slowly changing with the iOS framework, which now allows users to turn off location sharing with specific apps even after installation. The MIT Living Lab platform is a mobile app development platform that uses openPDS to provide MIT users with personal data stores but currently lacks user controls for privacy. This thesis presents PrivacyMate, a suite of tools for MIT Living Labs that provides user-controllable privacy mechanisms for mobile apps. PrivacyMate aims to enable users to maintain better control over their mobile personal data. It extends the model of iOS and allows users to select or deselect various types of data (more than just location information) for collection and use by apps. Users can also provide temporal and spatial specifications to indicate a context in which they are comfortable sharing their data with certain apps. We incorporate the privacy mechanisms offered by PrivacyMate into two mobile apps built on the MIT Living Lab platform: ScheduleME and MIT-FIT. ScheduleME enables users to schedule meetings without disclosing either their locations or points of interest. MIT-FIT enables users to track personal and aggregate high-activity regions and times, as well as view personalized fitness-related event recommendations. The MIT Living Lab team is planning to eventually deploy PrivacyMate and MIT-FIT to the entire MIT community.
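The temporal and spatial context rules described above can be sketched as a simple permission check; the rule fields, app names, and data types below are hypothetical stand-ins, not PrivacyMate's actual API:

```python
from datetime import time

def sharing_allowed(rule, app, data_type, when, where):
    """Return True if a data type may be shared with an app in the given
    temporal (when) and spatial (where) context, per the user's rule."""
    return (app in rule["apps"]
            and data_type in rule["data_types"]
            and rule["from"] <= when <= rule["to"]
            and where in rule["places"])

# Hypothetical user rule: share location/accelerometer with MIT-FIT,
# but only on campus during the working day.
rule = {
    "apps": {"MIT-FIT"},
    "data_types": {"location", "accelerometer"},
    "from": time(8, 0), "to": time(18, 0),
    "places": {"campus"},
}

print(sharing_allowed(rule, "MIT-FIT", "location", time(12, 30), "campus"))  # True
print(sharing_allowed(rule, "MIT-FIT", "location", time(23, 0), "campus"))   # False
```

Outside the stated context, collection is simply denied, which is the finer-grained alternative to "all-or-nothing" install-time permissions.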
by Sharon Myrtle Paradesi.
Elec. E. in Computer Science
APA, Harvard, Vancouver, ISO, and other styles
35

Simmons, Sean Kenneth. "Preserving patient privacy in biomedical data analysis." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/101821.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Mathematics, 2015.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 147-154).
The growing number of large biomedical databases and electronic health records promises to be an invaluable resource for biomedical researchers. Recent work, however, has shown that sharing this data, even when aggregated to produce p-values, regression coefficients, count queries, and minor allele frequencies (MAFs), may compromise patient privacy. This raises a fundamental question: how do we protect patient privacy while still making the most of their data? In this thesis, we develop various methods to perform privacy-preserving analysis on biomedical data, with an eye towards genomic data. We begin by introducing a model-based measure, PrivMAF, that allows us to decide when it is safe to release MAFs. We modify this measure to deal with perturbed data, and show that we are able to achieve privacy guarantees while adding less noise (and thus preserving more useful information) than previous methods. We also consider using differentially private methods to preserve patient privacy. Motivated by cohort selection in medical studies, we develop an improved method for releasing differentially private medical count queries. We then turn to differentially private genome-wide association studies (GWAS). We improve the runtime and utility of various privacy-preserving methods for genome analysis, bringing these methods much closer to real-world applicability. Building on this result, we develop differentially private versions of more powerful statistics based on linear mixed models.
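For background, the standard way to release a differentially private count is the textbook Laplace mechanism (shown here as a generic sketch, not Simmons's improved method): a count query changes by at most 1 when one patient is added or removed, so Laplace noise of scale 1/ε suffices for ε-differential privacy.

```python
import math
import random

def laplace_noise(scale):
    """Sample from Laplace(0, scale) via inverse transform sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(true_count, epsilon):
    """Release a count with epsilon-differential privacy.
    Count queries have sensitivity 1, so the noise scale is 1/epsilon."""
    return true_count + laplace_noise(1.0 / epsilon)

random.seed(0)
noisy = dp_count(128, epsilon=0.5)  # hypothetical cohort of 128 patients
print(round(noisy, 2))
```

Smaller ε means stronger privacy but a noisier (less useful) released count, which is exactly the privacy/utility trade-off the thesis optimises.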
by Sean Kenneth Simmons.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
36

Mittal, Nupur. "Data, learning and privacy in recommendation systems." Thesis, Rennes 1, 2016. http://www.theses.fr/2016REN1S084/document.

Full text
Abstract:
Recommendation systems have become an indispensable part of Internet services and applications, particularly due to the overload of data coming from many sources. Whatever its type, every recommendation system faces fundamental challenges. In this work, we identify three common challenges encountered by all types of recommendation systems: data, learning models, and privacy. We elaborate on the various problems that inappropriate data can create, focusing on its quality and quantity. In addition, we highlight the importance of social networks in making data about the users of recommendation systems publicly available, in order to improve the quality of recommendations. We also demonstrate the inference capabilities that public data linked to user data provides. In our work, we exploit this capability to improve the quality of recommendations, but we also argue that it creates threats to users' privacy based on their information. For our second challenge, we propose a new version of the k-nearest-neighbours (k-NN) method, one of the most popular learning methods for recommendation systems. Our solution, designed to exploit the bipartite nature of user-item datasets, is scalable, fast and efficient for building a k-NN graph, and is motivated by the large amount of resources consumed by similarity computations in k-NN calculations. We use experiments on real datasets from various domains to demonstrate the speed and efficiency of our algorithm, KIFF, compared with state-of-the-art approaches.
For our last contribution, we provide a mechanism that allows users to conceal their opinions on social networks without concealing their identity.
Recommendation systems have gained tremendous popularity, both in academia and industry. They have evolved into many different varieties depending mostly on the techniques and ideas used in their implementation. This categorization also marks the boundary of their application domain. Regardless of their type, recommendation systems are complex and multi-disciplinary in nature, involving subjects like information retrieval, data cleansing and preprocessing, and data mining. In our work, we identify three different challenges (among many possible) involved in the process of making recommendations and provide solutions to them. We elaborate on the challenges involved in obtaining user-demographic data and processing it to render it useful for making recommendations. The focus here is on making use of Online Social Networks to access publicly available user data to help recommendation systems. Using user-demographic data to improve personalized recommendations has many other advantages, like dealing with the famous cold-start problem, and it is one of the founding pillars of hybrid recommendation systems. With this work, we underline the importance of a user's publicly available information, like tweets, posts and votes, for inferring more private details about her. As the second challenge, we aim at improving the learning process of recommendation systems. Our goal is to provide a k-nearest-neighbour method that deals with very large datasets, surpassing billions of users. We propose a generic, fast and scalable k-NN graph construction algorithm that improves performance significantly compared to state-of-the-art approaches. Our idea is based on leveraging the bipartite nature of the underlying dataset and using a preprocessing phase to reduce the number of similarity computations in later iterations. As a result, we gain a speed-up of 14 compared to other significant approaches from the literature.
Finally, we also consider the issue of privacy. Instead of viewing it directly in the context of conventional recommendation systems, we analyze it on Online Social Networks. First, we reason about how OSNs can be seen as a form of recommendation system and how information dissemination is similar to broadcasting opinions/reviews in conventional recommendation systems. Following this parallelism, we identify a privacy threat in information diffusion in OSNs and provide a privacy-preserving algorithm for it. Our algorithm, Riposte, quantifies privacy in terms of differential privacy, and with the help of experimental datasets we demonstrate how Riposte maintains the desirable information diffusion properties of a network.
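The idea of letting users hide their opinion while an aggregate remains measurable can be illustrated with classic randomized response, a simple mechanism satisfying differential privacy (a generic sketch, not the Riposte algorithm itself):

```python
import math
import random

def randomized_response(true_opinion, epsilon):
    """Report the true opinion (a bool) with probability e^eps / (1 + e^eps),
    otherwise flip it; this satisfies epsilon-differential privacy."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return true_opinion if random.random() < p_truth else not true_opinion

def estimate_support(reports, epsilon):
    """Debias the aggregate by inverting the known flipping probability."""
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)

random.seed(42)
epsilon = 1.0
true_opinions = [True] * 700 + [False] * 300  # 70% support an opinion
reports = [randomized_response(o, epsilon) for o in true_opinions]
est = estimate_support(reports, epsilon)
print(round(est, 2))  # close to 0.7
```

No individual report reveals the user's true opinion with certainty, yet the population-level estimate stays usable, the same spirit in which Riposte preserves diffusion properties.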
APA, Harvard, Vancouver, ISO, and other styles
37

Keerthi, Thomas. "Distilling mobile privacy requirements from qualitative data." Thesis, Open University, 2014. http://oro.open.ac.uk/40121/.

Full text
Abstract:
As mobile computing applications have become commonplace, it is increasingly important for them to address end-users' privacy requirements. Mobile privacy requirements depend on a number of contextual socio-cultural factors, to which mobility adds another level of contextual variation. However, traditional requirements elicitation methods do not sufficiently account for contextual factors and therefore cannot be used effectively to represent and analyse the privacy requirements of mobile end-users. On the other hand, methods that investigate contextual factors tend to produce data which can be difficult to use for requirements modelling. To address this problem, we have developed a Distillation approach that employs a problem analysis model to extract and refine privacy requirements for mobile applications from raw data gathered through empirical studies involving real users. Our aim was to enable the extraction of mobile privacy requirements that account for relevant contextual factors while contributing to the software design and implementation process. A key feature of the distillation approach is a problem structuring framework called privacy facets (PriF). The facets in the PriF framework support the identification of privacy requirements from different contextual perspectives, namely actors, information, information-flows and places. The PriF framework also aids in uncovering privacy determinants and threats that a system must take into account in order to support the end-user's privacy. In this work, we first show the working of distillation using qualitative data taken from an empirical study of the social-networking practices of mobile users. As a means of validating distillation, a distinctly separate qualitative dataset from a location-tracking study is used; in both cases, the empirical studies relate to privacy issues faced by real users observed in their mobile environment.
APA, Harvard, Vancouver, ISO, and other styles
38

Ainslie, Mandi. "Big data and privacy : a modernised framework." Diss., University of Pretoria, 2017. http://hdl.handle.net/2263/59805.

Full text
Abstract:
Like the revolutions that preceded it, the Fourth Industrial Revolution has the potential to raise global income levels and improve the quality of life for populations around the world. Responding to global challenges, generating efficiencies, improving prediction, democratising access to information and empowering individuals are a few examples of the economic and social value created by personal information. However, this technological innovation, efficiency and productivity comes at a price: privacy. As a result, individuals are increasingly concerned that companies and governments are not protecting data about them and that they are instead using it in ways not necessarily in their best interests. The objective of this research is to investigate the validity and feasibility of a Personal Data Store (PDS) against the developed a priori framework. Ten qualitative, semi-structured interviews using the long interview method were conducted with individuals identified as subject matter experts (SMEs) in the Big Data analytics and data privacy fields. The findings show that the guiding principles of transparency, control, trust and value ensure the validity and feasibility of the PDS. Furthermore, user-centricity provides greater control within the Big Data continuum. However, as personal data should not be trusted in the hands of third parties, identity management and security must be entrenched at a foundational level of the model. The remaining elements (selective disclosure, purpose and duration, signalling and data portability) are in fact value-adding qualities that allow for the commodification of personal data. In the age of the Internet of Things (IoT), organisations churn out increasing volumes of transactional data, capturing trillions of bytes of information about their customers, suppliers and operations.
However, amplifying the rate of technological disruption with the failure to provide safe spaces where individuals can think free, divergent and creative thoughts will significantly diminish the progress organisations (and society) can enjoy.
Mini Dissertation (MBA)--University of Pretoria, 2017.
ms2017
Gordon Institute of Business Science (GIBS)
MBA
Unrestricted
APA, Harvard, Vancouver, ISO, and other styles
39

Nan, Lihao. "Privacy Preserving Representation Learning For Complex Data." Thesis, The University of Sydney, 2019. http://hdl.handle.net/2123/20662.

Full text
Abstract:
Here we consider a common data encryption problem encountered by users who want to disclose some data to gain utility but preserve their private information. Specifically, we consider the inference attack, in which an adversary conducts inference on the disclosed data to gain information about users' private data. Following the privacy funnel \cite{makhdoumi2014information}, assuming that the original data $X$ is transformed into $Z$ before disclosure and the log loss is used for both privacy and utility metrics, the problem can be modeled as finding a mapping $X \rightarrow Z$ that maximizes the mutual information between $X$ and $Z$ subject to the constraint that the mutual information between $Z$ and the private data $S$ is smaller than a predefined threshold $\epsilon$. In contrast to the original study \cite{makhdoumi2014information}, which only considered discrete data, we consider the more general and practical setting of continuous and high-dimensional disclosed data (e.g., image data). Most previous work on privacy-preserving representation learning is based on adversarial learning or generative adversarial networks, which have been shown to suffer from the vanishing gradient problem, and it is experimentally difficult to eliminate the relationship with the private data $S$ when $Z$ is constrained to retain more information about $X$. Here we propose a simple but effective variational approach that does not rely on adversarial training. Our experimental results show that our approach is stable and outperforms previous methods in terms of both downstream task accuracy and mutual information estimation.
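Stated in display form, the optimisation problem the abstract describes (following the formulation given there, with $\epsilon$ bounding the leakage about the private data $S$) is:

```latex
\max_{p(z \mid x)} \; I(X; Z)
\quad \text{subject to} \quad
I(S; Z) \le \epsilon
```

The mapping $p(z \mid x)$ trades utility (information about $X$ retained in $Z$) against privacy (information about $S$ leaked by $Z$).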
APA, Harvard, Vancouver, ISO, and other styles
40

Huang, Xueli. "Achieving Data Privacy and Security in Cloud." Diss., Temple University Libraries, 2016. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/372805.

Full text
Abstract:
Computer and Information Science
Ph.D.
Growing concerns about the privacy of data stored in public clouds have restrained the widespread adoption of cloud computing. The traditional method to protect data privacy is to encrypt data before it is sent to the public cloud, but this approach always introduces heavy computation, especially for image and video data, which is far larger than text data. Another way is to take advantage of a hybrid cloud by separating sensitive data from non-sensitive data and storing them in a trusted private cloud and an un-trusted public cloud respectively. But if we adopt this method directly, all images and videos containing sensitive data have to be stored in the private cloud, which makes the method meaningless. Moreover, the emergence of the Software-Defined Networking (SDN) paradigm, which decouples the control logic from the closed and proprietary implementations of traditional network devices, enables researchers and practitioners to design new innovative network functions and protocols in a much easier, more flexible, and more powerful way. The data plane asks the control plane to update flow rules when it receives new network packets it does not know how to handle, and the control plane then dynamically deploys and configures flow rules according to the data plane's requests, which allows the whole network to be managed and controlled efficiently. However, this kind of reactive control model could be exploited by attackers launching Distributed Denial-of-Service (DDoS) attacks that send large numbers of new requests from the data plane to the control plane. For image data, we divide the image into pieces of equal size to speed up the encryption process, and propose two methods to cut the relationship between the edges.
One is to add random noise to each piece; the other is to design a one-to-one mapping function for each piece that maps each pixel value to a different one, which cuts off the relationship between pixels as well as the edges. Our mapping function takes a random parameter as input so that each piece can randomly choose a different mapping. Finally, we shuffle the pieces with another random parameter, which makes the problem of recovering the shuffled image NP-complete. For video data, we propose two different methods, for intra frames (I-frames) and inter frames (P-frames), based on their different characteristics. A hybrid selective video encryption scheme for H.264/AVC based on the Advanced Encryption Standard (AES) and the video data themselves is proposed for I-frames. For each P-slice of a P-frame, we abstract only a small part of it into the private cloud, based on the characteristics of the intra prediction mode, which efficiently prevents the P-frame from being decoded. For clouds running SDN, we propose a framework to keep the controller safe from DDoS attacks. We first predict the number of new requests for each switch periodically based on its previous information, and the new requests are sent to the controller if the predicted total is below a threshold. Otherwise, the requests are directed to the security gateway to check whether there is an attack among them. Requests that cause a dramatic decrease in entropy are filtered out by our algorithm, and rules for these requests are made and sent to the controller. The controller sends the rules to each switch so that flows matching the rules are directed to a honeypot.
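The per-piece pixel mapping and piece shuffling can be sketched as follows; this is a toy illustration with a keyed pseudo-random permutation, not the thesis's actual scheme:

```python
import random

def encrypt_pieces(pieces, key):
    """Apply a keyed one-to-one pixel-value mapping to each piece, then
    shuffle the piece order. Distinct per-piece mappings destroy edge
    continuity between neighbouring pieces."""
    rng = random.Random(key)
    encrypted = []
    for piece in pieces:
        mapping = list(range(256))
        rng.shuffle(mapping)                 # one-to-one pixel-value map for this piece
        encrypted.append([mapping[p] for p in piece])
    order = list(range(len(encrypted)))
    rng.shuffle(order)                       # shuffle the pieces themselves
    return [encrypted[i] for i in order], order

# Three tiny "pieces" of hypothetical 8-bit pixel data.
pieces = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
cipher, order = encrypt_pieces(pieces, key=1234)
print(cipher != pieces)  # True: pixel values and piece order are both scrambled
```

A real implementation would also retain the per-piece mappings and the shuffle order (derived from the secret key) so the trusted side can invert the transformation.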
Temple University--Theses
APA, Harvard, Vancouver, ISO, and other styles
41

Brown, Emily Elizabeth. "Adaptable Privacy-preserving Model." Diss., NSUWorks, 2019. https://nsuworks.nova.edu/gscis_etd/1069.

Full text
Abstract:
Current data privacy-preservation models lack the ability to aid data decision makers in processing datasets for publication. The proposed algorithm allows data processors to simply provide a dataset, state their criteria, and receive a recommended xk-anonymity approach. Additionally, the algorithm can be tailored to a preference and gives the precision range and maximum data loss associated with the recommended approach. This dissertation report outlines the research's goal, the barriers that were overcome, and the limitations of the work's scope. It highlights the results from each experiment conducted and how they influenced the creation of the final adaptable algorithm. The xk-anonymity model builds upon two foundational privacy models, the k-anonymity and l-diversity models. Overall, this study had many takeaways on data and its power in a dataset.
APA, Harvard, Vancouver, ISO, and other styles
42

Huang, Zhengli. "Privacy and utility analysis of the randomization approach in Privacy-Preserving Data Publishing." Related electronic resource: Current Research at SU : database of SU dissertations, recent titles available full text, 2008. http://wwwlib.umi.com/cr/syr/main.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Olsson, Mattias. "Klassificeringsalgoritmer vs differential privacy : Effekt på klassificeringsalgoritmer vid användande av numerisk differential privacy." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-15680.

Full text
Abstract:
Data mining is an umbrella term for a number of techniques used to analyse datasets and find patterns, for example through classification. Anonymization comprises a range of techniques for protecting personal privacy. This study examines how strongly anonymization with the differential privacy technique affects the ability to classify a dataset. Through an experiment, a number of magnitudes of anonymization are examined together with the effect they have on the ability to classify data. Classification of the anonymized dataset is compared against classification of the raw dataset. Similar studies have been conducted with k-anonymity as the anonymization technique, where the ability to classify improved thanks to generalization. The results of this study, on the other hand, show that the ability to classify decreases somewhat, because differential privacy spreads the information in the dataset over a broader spectrum. This generally makes it harder for the classification algorithms to find characteristic patterns in the dataset, and they therefore do not achieve as high a degree of correct classification.
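Numeric differential privacy of the kind evaluated in this study is typically realized with the Laplace mechanism: noise with scale sensitivity/epsilon is added to each value. A minimal sketch follows; it is an illustration of the general mechanism, not the study's actual anonymization code.

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) via the inverse-CDF transform."""
    u = rng.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

def dp_release(value, sensitivity, epsilon, rng):
    """Release a numeric value under epsilon-differential privacy."""
    return value + laplace_noise(sensitivity / epsilon, rng)
```

A smaller epsilon (stronger privacy) yields larger noise, which is exactly the utility trade-off the experiment measures through classification accuracy.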
APA, Harvard, Vancouver, ISO, and other styles
44

Sehatkar, Morvarid. "Towards a Privacy Preserving Framework for Publishing Longitudinal Data." Thesis, Université d'Ottawa / University of Ottawa, 2014. http://hdl.handle.net/10393/31629.

Full text
Abstract:
Recent advances in information technology have enabled public organizations and corporations to collect and store huge amounts of individuals' data in data repositories. Such data are powerful sources of information about an individual's life such as interests, activities, and finances. Corporations can employ data mining and knowledge discovery techniques to extract useful knowledge and interesting patterns from large repositories of individuals' data. The extracted knowledge can be exploited to improve strategic decision making, enhance business performance, and improve services. However, person-specific data often contain sensitive information about individuals and publishing such data poses potential privacy risks. To deal with these privacy issues, data must be anonymized so that no sensitive information about individuals can be disclosed from published data while distortion is minimized to ensure usefulness of data in practice. In this thesis, we address privacy concerns in publishing longitudinal data. A data set is longitudinal if it contains information of the same observation or event about individuals collected at several points in time. For instance, the data set of multiple visits of patients of a hospital over a period of time is longitudinal. Due to temporal correlations among the events of each record, potential background knowledge of adversaries about an individual in the context of longitudinal data has specific characteristics. None of the previous anonymization techniques can effectively protect longitudinal data against an adversary with such knowledge. In this thesis we identify the potential privacy threats on longitudinal data and propose a novel framework of anonymization algorithms in a way that protects individuals' privacy against both identity disclosure and attribute disclosure, and preserves data utility. 
Particularly, we propose two privacy models, (K,C)^P-privacy and (K,C)-privacy, and for each of these models we propose efficient algorithms for anonymizing longitudinal data. An extensive experimental study demonstrates that our proposed framework can effectively and efficiently anonymize longitudinal data.
APA, Harvard, Vancouver, ISO, and other styles
45

Stroud, Caleb Zachary. "Implementing Differential Privacy for Privacy Preserving Trajectory Data Publication in Large-Scale Wireless Networks." Thesis, Virginia Tech, 2018. http://hdl.handle.net/10919/84548.

Full text
Abstract:
Wireless networks collect vast amounts of log data concerning usage of the network. These data inform operational needs related to performance, maintenance, etc., but they are also useful to outside researchers analyzing network operation and user trends. Releasing such information to these outside researchers poses a threat to the privacy of users, so the competing needs for utility and privacy must both be addressed. This thesis studies the concept of differential privacy for fulfilling the goals of releasing high-utility data to researchers while maintaining user privacy. The focus is specifically on physical user trajectories in authentication manager log data, since this is a rich type of data that is useful for trend analysis. Authentication manager log data is produced when devices connect to physical access points (APs), and trajectories are sequences of these spatiotemporal connections from one AP to another for the same device. This goal is pursued with a variable-length n-gram model that creates a synthetic database which can be easily ingested by researchers. We found that the chosen algorithm has shortcomings in its specific application to the chosen data, but differential privacy itself can still be used to release sanitized datasets while maintaining utility if the data has low sparsity.
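A simplified flavour of the n-gram step can be sketched as follows. This is illustrative only: sensitivity is treated as 1 purely for brevity, whereas a real variable-length n-gram model budgets the privacy parameter across the levels of a prefix tree, and the function name is an assumption.

```python
import math
import random
from collections import Counter

def noisy_ngram_counts(trajectories, n, epsilon, seed=0):
    """Count length-n AP subsequences across trajectories, then perturb
    each count with Laplace noise (sensitivity assumed to be 1 here)."""
    rng = random.Random(seed)
    counts = Counter()
    for t in trajectories:
        for i in range(len(t) - n + 1):
            counts[tuple(t[i:i + n])] += 1
    scale = 1.0 / epsilon
    def lap():
        u = rng.random() - 0.5
        return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)
    return {g: c + lap() for g, c in counts.items()}
```

Synthetic trajectories can then be sampled from the noisy counts, which is what makes the released database easy for researchers to ingest.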
Master of Science
APA, Harvard, Vancouver, ISO, and other styles
46

Sallaku, Redlon <1994&gt. "Privacy and Protecting Privacy: Using Static Analysis for legal compliance. General Data Protection Regulation." Master's Degree Thesis, Università Ca' Foscari Venezia, 2019. http://hdl.handle.net/10579/14682.

Full text
Abstract:
The main purpose of the thesis is to study privacy and how to protect it, including the new regulatory framework proposed by the EU, the GDPR; to investigate how static analysis could help GDPR enforcement; and to develop a new static analysis prototype to fulfil this task in practice. The GDPR (General Data Protection Regulation) is a recent European regulation to harmonize and enforce data privacy laws across Europe, to protect and empower all EU citizens' data privacy, and to reshape the way organizations deal with sensitive data. This regulation has been enforced since May 2018. While it is already clear that there is no unique solution covering the whole spectrum of the GDPR, it is still unclear how static analysis might help enterprises fulfil the constraints imposed by this regulation.
APA, Harvard, Vancouver, ISO, and other styles
47

Casas, Roma Jordi. "Privacy-preserving and data utility in graph mining." Doctoral thesis, Universitat Autònoma de Barcelona, 2014. http://hdl.handle.net/10803/285566.

Full text
Abstract:
In recent years, an explosive increase of graph-formatted data has been made publicly available. Embedded within this data there is private information about users who appear in it. Therefore, data owners must respect the privacy of users before releasing datasets to third parties. In this scenario, anonymization processes become an important concern. However, anonymization processes usually introduce some kind of noise in the anonymous data, altering the data and also their results on graph mining processes. Generally, the higher the privacy, the larger the noise. Thus, data utility is an important factor to consider in anonymization processes. The necessary trade-off between data privacy and data utility can be reached by using measures and metrics to lead the anonymization process to minimize the information loss, and therefore, to maximize the data utility. In this thesis we have covered the fields of user's privacy-preserving in social networks and the utility and quality of the released data. A trade-off between both fields is a critical point to achieve good anonymization methods for the subsequent graph mining processes. Part of this thesis has focused on data utility and information loss. Firstly, we have studied the relation between the generic information loss measures and the clustering-specific ones, in order to evaluate whether the generic information loss measures are indicative of the usefulness of the data for subsequent data mining processes. We have found strong correlation between some generic information loss measures (average distance, betweenness centrality, closeness centrality, edge intersection, clustering coefficient and transitivity) and the precision index over the results of several clustering algorithms, demonstrating that these measures are able to predict the perturbation introduced in anonymous data. Secondly, two measures to reduce the information loss on graph modification processes have been presented. 
The first one, Edge neighbourhood centrality, is based on the information flow through the 1-neighbourhood of a specific edge in the graph. The second one is based on the core number sequence, and it better preserves the underlying graph structure, retaining more data utility. Through an extensive experimental setup, we have demonstrated that both methods are able to preserve the most important edges in the network, keeping the basic structural and spectral properties close to the original ones. The other important topic of this thesis has been privacy-preserving methods. We have presented our random-based algorithm, which utilizes the concept of Edge neighbourhood centrality to drive the edge modification process to better preserve the most important edges in the graph, achieving lower information loss and higher data utility in the released data. Our method obtains a better trade-off between data utility and data privacy than other methods. Finally, two different approaches for k-degree anonymity on graphs have been developed. First, an algorithm based on evolutionary computing has been presented and tested on different small and medium real networks. Although this method allows us to fulfil the desired privacy level, it presents two main drawbacks: the information loss is quite large for some graph structural properties, and it is not fast enough to work with large networks. Therefore, a second algorithm has been presented, which uses univariate micro-aggregation to anonymize the degree sequence and reduce the distance from the original one. This method is quasi-optimal and results in lower information loss and better data utility.
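The degree-sequence idea behind the second k-degree anonymity algorithm can be sketched as follows. This is a simplified fixed-size grouping, not the quasi-optimal univariate micro-aggregation of the thesis, and the function name is an assumption.

```python
from collections import Counter

def k_degree_anonymize(degrees, k):
    """Micro-aggregate a degree sequence so every degree value is shared by
    at least k nodes: sort, group, replace each group by its rounded mean."""
    order = sorted(range(len(degrees)), key=lambda i: degrees[i])
    anon = list(degrees)
    for start in range(0, len(order), k):
        group = order[start:start + k]
        if len(group) < k:              # fold a short tail into the last group
            group = order[start - k:]
        mean = round(sum(degrees[i] for i in group) / len(group))
        for i in group:
            anon[i] = mean
    return anon
```

A graph modification step would then add or remove edges until each node's degree matches the anonymized sequence.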
APA, Harvard, Vancouver, ISO, and other styles
48

Rodríguez, García María Mercedes. "Semantic perturbative privacy-preserving methods for nominal data." Doctoral thesis, Universitat Rovira i Virgili, 2017. http://hdl.handle.net/10803/435689.

Full text
Abstract:
The exploitation of personal microdata (such as census data, preferences or medical records) is of great interest to the data mining community. Such data often include sensitive information that can be directly or indirectly related to individuals. Therefore, privacy-preserving measures should be undertaken to minimize the risk of re-identification and, hence, of disclosing confidential information about the individuals. In the past, many privacy-preserving methods have been developed to deal with numerical data, but approaches tackling the protection of nominal values are scarce. Since the utility of this kind of data is closely related to the preservation of their semantics, in this work we exploit several semantic technologies to enable a semantically coherent protection of nominal data. Specifically, we use ontologies as the basis for a semantic framework that enables an appropriate management of nominal data in data protection tasks; this framework consists of a set of operators that characterize and transform nominal data while taking into account their semantics. Then, we use this framework to adapt perturbative privacy-preserving methods to the nominal domain. Specifically, we focus on methods based on the two main principles underlying data protection: permutation-based approaches, i.e., rank swapping, and noise addition. The proposed methods have been extensively evaluated with real datasets. Experimental results show that a semantically coherent management of nominal data significantly improves the semantic interpretability and the utility of the protected outcomes.
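Classic rank swapping on numeric data, which this thesis adapts to nominal values via ontology-based semantic orderings, can be sketched as follows. This is an illustrative numeric version only, not the proposed semantic method.

```python
import random

def rank_swap(values, p, seed=0):
    """Classic numeric rank swapping: sort the values, then swap each value
    with a partner at most p ranks away, bounding distortion while breaking
    the link to the original record."""
    rng = random.Random(seed)
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = list(values)
    used = set()
    for pos, i in enumerate(order):
        if i in used:
            continue
        hi = min(pos + p, len(order) - 1)
        partners = [order[j] for j in range(pos + 1, hi + 1) if order[j] not in used]
        if partners:
            j = rng.choice(partners)
            out[i], out[j] = out[j], out[i]
            used.update((i, j))
    return out
```

In the semantic adaptation, the numeric ranking would be replaced by an ordering derived from ontology-based similarity between nominal values.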
APA, Harvard, Vancouver, ISO, and other styles
49

An, Nan. "Protect Data Privacy in E-Healthcare in Sweden." Thesis, Växjö University, School of Mathematics and Systems Engineering, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:vxu:diva-1619.

Full text
Abstract:

Swedish healthcare has adopted a great deal of ICT (information and communication technology) and is a highly information-intensive setting. This thesis gives a brief description of the background of healthcare in Sweden and of ICT adoption in healthcare, introduces an information system security model, describes the technology and law concerning data privacy, and carries out a case study through a questionnaire and interviews.

APA, Harvard, Vancouver, ISO, and other styles
50

Wang, Hui. "Secure query answering and privacy-preserving data publishing." Thesis, University of British Columbia, 2007. http://hdl.handle.net/2429/31721.

Full text
Abstract:
The last several decades have witnessed a phenomenal growth in the networking infrastructure connecting computers all over the world. The Web has now become a ubiquitous channel for information sharing and dissemination. More and more data is being exchanged and published on the Web. This growth has created a whole new set of research challenges, while giving a new spin to some existing ones. For example, XML (eXtensible Markup Language), a self-describing and semi-structured data format, has emerged as the standard for representing and exchanging data between applications across the Web. An important issue of data publishing is the protection of sensitive and private information. However, security- and privacy-enhancing techniques bring disadvantages: security-enhancing techniques may incur overhead for query answering, while privacy-enhancing techniques may degrade data utility. In this thesis, we study how to overcome such overhead. Specifically, we address the following two problems: (a) efficient and secure query evaluation over published XML databases, and (b) publishing relational databases while protecting privacy and preserving utility. The first part of this thesis focuses on efficiency and security issues of query evaluation over XML databases. To protect sensitive information in the published database, security policies must be defined and enforced, which results in unavoidable overhead. Due to this security overhead and the complex structure of XML databases, query evaluation may become inefficient. In this thesis, we study how to securely and efficiently evaluate queries over XML databases. First, we consider the access-controlled database. We focus on a security model in which every XML element either is locally assigned a security level or inherits the security level from one of its ancestors. Security checks in this model can cause considerable overhead for query evaluation.
We investigate how to reduce the security overhead by analyzing the subtle interactions between inheritance of security levels and the structure of the XML database. We design a sound and complete set of rules and develop efficient, polynomial-time algorithms for optimizing security checks on queries. Second, we consider encrypted XML database in a "database-as-service" model, in which the private database is hosted by an untrusted server. Since the untrusted server has no decryption key, its power of query processing is very limited, which results in inefficient query evaluation. We study how to support secure and efficient query evaluation in this model. We design the metadata that will be hosted on the server side with the encrypted database. We show that the presence of the metadata not only facilitates query processing but also guarantees data security. We prove that by observing a series of queries from the client and responses by itself, the server's knowledge about the sensitive information in the database is always below a given security threshold. The second part of this thesis studies the problem of preserving both privacy and the utility when publishing relational databases. To preserve utility, the published data will not be perturbed. Instead, the base table in the original database will be decomposed into several view tables. First, we define a general framework to measure the likelihood of privacy breach of a published view. We propose two attack models, unrestricted and restricted models, and derive formulas to quantify the privacy breach for each model. Second, we take utility into consideration. Specifically, we study the problem of how to design the scheme of published views, so that data privacy is protected while maximum utility is guaranteed. Given a database and its scheme, there are exponentially many candidates for published views that satisfy both privacy and utility constraints. 
We prove that finding the globally optimal safe and faithful view, i.e., the view that does not violate any privacy constraints and provides the maximum utility, is NP-hard. We propose the locally optimal safe and faithful view as the heuristic, and show how we can efficiently find a locally optimal safe and faithful view in polynomial time.
Science, Faculty of
Computer Science, Department of
Graduate
APA, Harvard, Vancouver, ISO, and other styles
