Dissertations on the topic "Private data publishing"
Format your source in APA, MLA, Chicago, Harvard, and other citation styles
Browse the top 28 dissertations for your research on the topic "Private data publishing".
Next to every entry in the list there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the publication as a .pdf file and read its abstract online, whenever these are available in the record metadata.
Browse dissertations from many disciplines and compile your bibliography correctly.
Zhang, Yihua. "ON DATA UTILITY IN PRIVATE DATA PUBLISHING." Miami University / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=miami1272986770.
Shang, Hui. "Privacy Preserving Kin Genomic Data Publishing." Miami University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=miami1594835227299524.
Lin, Zehua. "Privacy Preserving Social Network Data Publishing." Miami University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=miami1610045108271476.
Loukides, Grigorios. "Data utility and privacy protection in data publishing." Thesis, Cardiff University, 2008. http://orca.cf.ac.uk/54743/.
Chen, Xiaoqiang. "Privacy Preserving Data Publishing for Recommender System." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-155785.
Sehatkar, Morvarid. "Towards a Privacy Preserving Framework for Publishing Longitudinal Data." Thesis, Université d'Ottawa / University of Ottawa, 2014. http://hdl.handle.net/10393/31629.
Wang, Hui. "Secure query answering and privacy-preserving data publishing." Thesis, University of British Columbia, 2007. http://hdl.handle.net/2429/31721.
Повний текст джерелаScience, Faculty of
Computer Science, Department of
Graduate
Huang, Zhengli. "Privacy and utility analysis of the randomization approach in Privacy-Preserving Data Publishing." Related electronic resource: Current Research at SU : database of SU dissertations, recent titles available full text, 2008. http://wwwlib.umi.com/cr/syr/main.
Hajian, Sara. "Simultaneous discrimination prevention and privacy protection in data publishing and mining." Doctoral thesis, Universitat Rovira i Virgili, 2013. http://hdl.handle.net/10803/119651.
Yang, Cao. "Rigorous and Flexible Privacy Protection Framework for Utilizing Personal Spatiotemporal Data." 京都大学 (Kyoto University), 2017. http://hdl.handle.net/2433/225733.
Jafer, Yasser. "Task Oriented Privacy-preserving (TOP) Technologies Using Automatic Feature Selection." Thesis, Université d'Ottawa / University of Ottawa, 2016. http://hdl.handle.net/10393/34320.
Li, Yidong. "Preserving privacy in data publishing and analysis." Thesis, 2011. http://hdl.handle.net/2440/68556.
Повний текст джерелаThesis (Ph.D.) -- University of Adelaide, School of Computer Science, 2011
"Privacy preserving data publishing." Thesis, 2008. http://library.cuhk.edu.hk/record=b6074672.
This thesis presents an extensive study of anonymization techniques for privacy preserving data publishing. We explore various aspects of the problem (e.g., definitions of privacy, modeling of the adversary, methodologies of anonymization), and devise novel solutions that address several important issues overlooked by previous work. Experiments with real-world data confirm the effectiveness and efficiency of our techniques.
Xiao, Xiaokui.
Adviser: Yufei Tao.
Source: Dissertation Abstracts International, Volume: 70-06, Section: B, page: 3618.
Thesis (Ph.D.)--Chinese University of Hong Kong, 2008.
Includes bibliographical references (leaves 307-314).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Electronic reproduction. [Ann Arbor, MI] : ProQuest Information and Learning, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstracts in English and Chinese.
School code: 1307.
Iftikhar, Masooma. "Privacy-Preserving Data Publishing." PhD thesis, 2022. http://hdl.handle.net/1885/272877.
Cao, Ming. "Privacy Protection on RFID Data Publishing." Thesis, 2009. http://spectrum.library.concordia.ca/976641/1/MR63109.pdf.
Chen, Rui. "Toward Privacy in High-Dimensional Data Publishing." Thesis, 2012. http://spectrum.library.concordia.ca/974691/4/Chen_PhD_F2012.pdf.
"Preservation of privacy in sensitive data publishing." 2008. http://library.cuhk.edu.hk/record=b5893631.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2008.
Includes bibliographical references (leaves [105]-110).
Abstracts in English and Chinese.
Abstract --- p.i
Acknowledgement --- p.iv
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Problem Statement --- p.1
Chapter 1.2 --- Contributions --- p.3
Chapter 1.3 --- Thesis Organization --- p.5
Chapter 2 --- Background Study --- p.7
Chapter 2.1 --- Generalization Algorithms --- p.7
Chapter 2.2 --- Privacy Principles --- p.10
Chapter 2.3 --- Other Related Research --- p.11
Chapter 3 --- Anti-Corruption Privacy Preserving Publication --- p.13
Chapter 3.1 --- Motivation --- p.13
Chapter 3.2 --- Problem Settings --- p.14
Chapter 3.3 --- Defects of Generalization --- p.18
Chapter 3.4 --- Perturbed Generalization --- p.23
Chapter 3.5 --- Modeling Privacy Attacks --- p.26
Chapter 3.5.1 --- Corruption-Aided Linking Attacks --- p.26
Chapter 3.5.2 --- Posterior Confidence Derivation --- p.28
Chapter 3.6 --- Formal Results --- p.30
Chapter 3.7 --- Experiments --- p.34
Chapter 3.8 --- Summary --- p.37
Chapter 4 --- Preservation of Proximity Privacy --- p.39
Chapter 4.1 --- Motivation --- p.39
Chapter 4.2 --- Formalization --- p.40
Chapter 4.2.1 --- Privacy Attacks --- p.41
Chapter 4.2.2 --- (ε, m)-Anonymity --- p.42
Chapter 4.3 --- Inadequacy of the Existing Methods --- p.44
Chapter 4.3.1 --- Inadequacy of Generalization Principles --- p.45
Chapter 4.3.2 --- Inadequacy of Perturbation --- p.49
Chapter 4.4 --- Characteristics of (ε, m)-Anonymity --- p.51
Chapter 4.4.1 --- A Reduction --- p.51
Chapter 4.4.2 --- Achievable Range of m Given ε1 and ε2 --- p.53
Chapter 4.4.3 --- Achievable ε1 and ε2 Given m --- p.57
Chapter 4.4.4 --- Selecting the Parameters --- p.60
Chapter 4.5 --- Generalization Algorithm --- p.61
Chapter 4.5.1 --- Non-Monotonicity and Predictability --- p.61
Chapter 4.5.2 --- The Algorithm --- p.63
Chapter 4.6 --- Experiments --- p.65
Chapter 4.7 --- Summary --- p.70
Chapter 5 --- Privacy Preserving Publication for Multiple Users --- p.71
Chapter 5.1 --- Motivation --- p.71
Chapter 5.2 --- Problem Definition --- p.74
Chapter 5.2.1 --- K-Anonymity --- p.75
Chapter 5.2.2 --- An Observation --- p.76
Chapter 5.3 --- The Butterfly Method --- p.78
Chapter 5.3.1 --- The Butterfly Structure --- p.78
Chapter 5.3.2 --- Anonymization Algorithm --- p.83
Chapter 5.4 --- Extensions --- p.89
Chapter 5.4.1 --- Handling More Than Two QIDs --- p.89
Chapter 5.4.2 --- Handling Collusion --- p.91
Chapter 5.5 --- Experiments --- p.93
Chapter 5.6 --- Summary --- p.101
Chapter 6 --- Conclusions and Future Work --- p.102
Chapter A --- List of Publications --- p.104
Bibliography --- p.105
HSIAO, MEI-HUI (蕭美慧). "Privacy-Preserving Data Publishing with Missing Values." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/7t7u9u.
National University of Kaohsiung
Master's Program, Department of Computer Science and Information Engineering
Academic year 105
Recently, privacy preserving data publishing has become an important research issue. Over the past few years, researchers have proposed many different privacy-preserving data anonymization methods, but all of them deal with non-missing data. In the real world, however, most published data contain missing values. No contemporary work has noticed this problem or investigated the effect of missing values on privacy preserving data publishing. The aim of this research is to discuss the impact of missing values on current privacy-preserving anonymization methods and to propose appropriate solutions. We investigate possible strategies, as well as their deficiencies, for adapting contemporary anonymization methods to missing values. Accordingly, we propose a new strategy and two privacy protection models, called Closed k-anonymity and Closed l-diversity. Closed k-anonymity prevents record linkage attacks, while Closed l-diversity prevents attribute linkage attacks. We also propose two corresponding algorithms, called Closed k-anonymization and Closed l-diversification. Finally, we compare our methods with the well-known k-anonymity and l-diversity, evaluating their performance by measuring information loss, privacy risk, and data utility when anonymizing two real datasets: census data and FAERS data. Experimental results show that our methods can effectively anonymize data with missing values, not only preventing privacy disclosure but also sustaining data utility and analysis results.
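As a rough illustration of the record-linkage condition these models build on, here is a minimal k-anonymity check in which missing or suppressed values (written with '*') are kept as literal tokens, so they match only themselves — one simplified reading of a "closed" treatment of missing values, not the authors' actual algorithm; the field names and records are invented for the example:

```python
from collections import Counter

def is_k_anonymous(records, quasi_ids, k):
    """True iff every combination of quasi-identifier values
    (missing/suppressed values kept as literal '*' tokens)
    occurs in at least k records."""
    groups = Counter(tuple(r[a] for a in quasi_ids) for r in records)
    return all(n >= k for n in groups.values())

rows = [
    {"age": "3*", "zip": "440**", "disease": "flu"},
    {"age": "3*", "zip": "440**", "disease": "cold"},
    {"age": "4*", "zip": "441**", "disease": "flu"},
]
print(is_k_anonymous(rows, ["age", "zip"], 2))  # False: the ('4*', '441**') group has only one record
```

A real anonymizer would search for the generalization that makes this check pass with the least information loss; the check itself is only the feasibility test.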
Al-Hussaeni, Khalil. "Preserving Data Privacy and Information Usefulness for RFID Data Publishing." Thesis, 2009. http://spectrum.library.concordia.ca/976457/1/MR63082.pdf.
Yang, Duen-Chuan (楊敦筌). "Privacy Preserving Data Publishing Techniques for Spontaneous Reporting System Data." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/53521333395278780908.
National University of Kaohsiung
Master's Program, Department of Computer Science and Information Engineering
Academic year 103
In recent years, spontaneous reporting systems (SRSs) have been widely established to collect adverse drug events (ADEs) for ADR detection and analysis, e.g., the FDA Adverse Event Reporting System (FAERS). Usually, SRS data contain sensitive personal health information that should be protected to prevent the identification of individuals, raising the need to anonymize the raw data before publication, namely privacy-preserving data publishing (PPDP). Although much work has been done on PPDP, very few studies have focused on protecting the privacy of SRS data. In this thesis, we first present the problem of, and the research issues for, anonymizing spontaneous ADE reporting data for privacy-preserving ADR signal detection. Four main characteristics of spontaneous ADE data are identified: rare ADE events, multiple individual records, a multi-valued sensitive attribute, and missing values. We examine the feasibility of contemporary privacy-preserving models for anonymizing SRS datasets, showing their incompetence in handling these issues, which raises the need for new privacy models and data anonymization methods. Therefore, we present a new privacy-preserving model, called MS(k, θ*)-bounding.
WANG, CHIEH-TENG (王介騰). "Privacy Preserving Anonymity for Periodical SRS Data Publishing." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/16278646066845717875.
National University of Kaohsiung
Master's Program, Department of Computer Science and Information Engineering
Academic year 104
In recent years, many countries have built spontaneous reporting systems (SRSs) to collect adverse drug events for ADR detection and analysis, e.g., the FDA Adverse Event Reporting System (FAERS). The SRS data are provided to researchers, and even opened to the public, to foster ADR research. Normally, SRS data contain personal information and private values such as indication. Thus, it is necessary to de-identify SRS data before publication to prevent the disclosure of individual privacy. However, researchers have pointed out that de-identifying personal identity alone is not enough to protect personal privacy. To publish data more safely, the technique of privacy-preserving data publishing (PPDP) has gradually attracted attention. Although researchers have proposed many different PPDP (privacy) models, these are not suitable for protecting SRS data from disclosure due to certain features of SRS data. As such, we proposed a privacy model called MS(k, θ*)-bounding and the associated algorithm MS-Anonymization in our previous work. In the real world, SRS data grow dynamically and need to be published periodically, which thwarts our single-release method, MS(k, θ*)-bounding, leaving cracks in the anonymization for an attacker to exploit. In this research, we investigate attacks on periodically published SRS data and propose a new privacy model called PPMS(k, θ*)-bounding, with the associated algorithm PPMS-Anonymization. Experimental results on the FAERS dataset show that our new method can prevent privacy disclosure from attacks in the periodical data publishing scenario, with a reasonable sacrifice of data utility and an acceptable deviation in the strength of ADR signals.
HSU, KUANG-YUNG (許絖詠). "Privacy-Preserving SRS Data Publishing with Missing Values." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/vy754k.
National University of Kaohsiung
Master's Program, Department of Computer Science and Information Engineering
Academic year 106
In recent years, many countries have established spontaneous reporting systems (SRSs) for the detection and analysis of adverse drug reactions (ADRs), such as the US Food and Drug Administration's Adverse Event Reporting System (FAERS). These SRS data usually contain sensitive personal information. To prevent privacy leakage, the data must be de-identified and processed by a privacy-preserving data publishing (PPDP) method before being published. Although many scholars have proposed various privacy protection models, these models overlook the characteristics of SRS data. Therefore, our lab proposed a privacy model dedicated to SRS data, MS(k, θ*)-bounding, and the corresponding anonymization method, MS-Anonymization. However, this method is only applicable to complete data and does not account for the large amount of missing data in practice. On the other hand, our lab also proposed privacy models for handling missing values, Closed k-anonymity and Closed l-diversity, but these are not tailored to the characteristics of SRS data. Therefore, in this thesis, we propose a new privacy model, Closed MS(k, θ*)-bounding, which combines MS(k, θ*)-bounding with Closed k-anonymity and Closed l-diversity, and we propose three new anonymization methods, Closed-MSpartition, Closed-MSdirect, and Closed-MSsorting, to process SRS data with missing values. We used FAERS data to test and compare the three methods in terms of information loss, privacy risk, and data utility. The results show that Closed-MSdirect performs best on information distortion, privacy exposure risk, and data utility. Although Closed-MSpartition and Closed-MSsorting have higher information loss and privacy risk, and lower data utility, than Closed-MSdirect, their results are still in an acceptable range. In summary, when a large proportion of SRS data contains missing values, our proposed methods can effectively prevent attackers from learning personal private information.
"Privacy preserving in serial data and social network publishing." 2010. http://library.cuhk.edu.hk/record=b5894365.
"August 2010."
Thesis (M.Phil.)--Chinese University of Hong Kong, 2010.
Includes bibliographical references (p. 69-72).
Abstracts in English and Chinese.
Chapter 1 --- Introduction --- p.1
Chapter 2 --- Related Work --- p.3
Chapter 3 --- Privacy Preserving Network Publication against Structural Attacks --- p.5
Chapter 3.1 --- Background and Motivation --- p.5
Chapter 3.1.1 --- Adversary knowledge --- p.6
Chapter 3.1.2 --- Targets of Protection --- p.7
Chapter 3.1.3 --- Challenges and Contributions --- p.10
Chapter 3.2 --- Preliminaries and Problem Definition --- p.11
Chapter 3.3 --- Solution: K-Isomorphism --- p.15
Chapter 3.4 --- Algorithm --- p.18
Chapter 3.4.1 --- Refined Algorithm --- p.21
Chapter 3.4.2 --- Locating Vertex Disjoint Embeddings --- p.30
Chapter 3.4.3 --- Dynamic Releases --- p.32
Chapter 3.5 --- Experimental Evaluation --- p.34
Chapter 3.5.1 --- Datasets --- p.34
Chapter 3.5.2 --- Data Structure of K-Isomorphism --- p.37
Chapter 3.5.3 --- Data Utilities and Runtime --- p.42
Chapter 3.5.4 --- Dynamic Releases --- p.47
Chapter 3.6 --- Conclusions --- p.47
Chapter 4 --- Global Privacy Guarantee in Serial Data Publishing --- p.49
Chapter 4.1 --- Background and Motivation --- p.49
Chapter 4.2 --- Problem Definition --- p.54
Chapter 4.3 --- Breach Probability Analysis --- p.57
Chapter 4.4 --- Anonymization --- p.58
Chapter 4.4.1 --- AG size Ratio --- p.58
Chapter 4.4.2 --- Constant-Ratio Strategy --- p.59
Chapter 4.4.3 --- Geometric Strategy --- p.61
Chapter 4.5 --- Experiment --- p.62
Chapter 4.5.1 --- Dataset --- p.62
Chapter 4.5.2 --- Anonymization --- p.63
Chapter 4.5.3 --- Evaluation --- p.64
Chapter 4.6 --- Conclusion --- p.68
Bibliography --- p.69
CHANG, YU-HSIANG (張煜祥). "Privacy-Preserving High Dimensional Data Publishing Mechanism Meets K-Anonymity and Differential Privacy." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/7n3k93.
Khokhar, Rashid Hussain. "Quantifying the Costs and Benefits of Privacy-Preserving Health Data Publishing." Thesis, 2013. http://spectrum.library.concordia.ca/977136/1/Khokhar_MASc_S2013.pdf.
Повний текст джерела"Privacy preserving data publishing: an expected gain model with negative association immunity." 2012. http://library.cuhk.edu.hk/record=b5549584.
Privacy preservation is an important issue in many applications, especially those that involve human subjects. In privacy preserving data publishing (PPDP), we study how to publish a database that contains data records of individuals so that their privacy is preserved while the published database still contains useful information for research or data analysis.
This thesis focuses on privacy models and algorithms in PPDP. We first propose an expected gain model to define whether privacy is preserved when publishing a database. The expected gain model satisfies the six axioms for quantifying private information proposed in this thesis, where the sixth axiom considers human factors from the viewpoint of social psychology. In addition, it considers the amount of advantage gained by an adversary by exploiting the private information deduced from a published database. Hence, the model reflects the reality that the adversary uses such an advantage to earn a profit, which is not considered by other existing privacy models. Then, we propose an algorithm to generate published databases that satisfy the expected gain model. Experiments on real datasets are conducted to show that the proposed algorithm is feasible for real applications. After that, we propose a value suppression framework to make the published databases immune to negative association, a kind of background/foreground knowledge attack. Experiments show that negative association immunity can be achieved by suppressing only a few percent of sensitive values on average. Finally, we investigate PPDP in a non-centralized environment, in which two or more data holders generate their own different but related published databases. We propose a non-centralized distinct l-diversity requirement as the privacy model and an algorithm to generate published databases that satisfy it. Experiments show that the proposed algorithm is feasible for real applications.
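For reference, the classical single-table version of distinct l-diversity that the non-centralized requirement generalizes can be sketched in a few lines. This is illustrative only — the thesis's contribution concerns private information deduced across multiple related releases, which this single-table check does not capture, and the attribute names are invented:

```python
from collections import defaultdict

def is_distinct_l_diverse(records, quasi_ids, sensitive, l):
    """True iff every quasi-identifier group contains at least
    l distinct values of the sensitive attribute."""
    groups = defaultdict(set)
    for r in records:
        groups[tuple(r[a] for a in quasi_ids)].add(r[sensitive])
    return all(len(vals) >= l for vals in groups.values())

rows = [
    {"age": "2*", "disease": "flu"},
    {"age": "2*", "disease": "hepatitis"},
    {"age": "3*", "disease": "flu"},
    {"age": "3*", "disease": "flu"},
]
print(is_distinct_l_diverse(rows, ["age"], "disease", 2))  # False: the '3*' group holds one distinct value
```

In the non-centralized setting, an adversary may intersect the groups of several such published tables, which is why the thesis needs a stronger, cross-publisher requirement.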
Detailed summary in vernacular field only.
Cheong, Chi Hong.
Thesis (Ph.D.)--Chinese University of Hong Kong, 2012.
Includes bibliographical references (leaves 186-193).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstract also in Chinese.
Abstract --- p.i
Acknowledgement --- p.iv
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Background --- p.1
Chapter 1.2 --- Thesis Contributions and Organization --- p.2
Chapter 1.3 --- Other Related Areas --- p.5
Chapter 1.3.1 --- Privacy Preserving Data Mining --- p.5
Chapter 1.3.2 --- Partition-Based Approach vs. Differential Privacy Approach --- p.5
Chapter 2 --- Expected Gain Model --- p.7
Chapter 2.1 --- Introduction --- p.8
Chapter 2.1.1 --- Background and Motivation --- p.8
Chapter 2.1.2 --- Contributions --- p.11
Chapter 2.2 --- Table Models --- p.12
Chapter 2.2.1 --- Private Table --- p.12
Chapter 2.2.2 --- Published Table --- p.13
Chapter 2.3 --- Private Information Model --- p.14
Chapter 2.3.1 --- Proposition --- p.14
Chapter 2.3.2 --- Private Information and Private Probability --- p.15
Chapter 2.3.3 --- Public Information and Public Probability --- p.18
Chapter 2.3.4 --- Axioms in Quantifying Private Information --- p.20
Chapter 2.4 --- Quantifying Private Information --- p.34
Chapter 2.4.1 --- Expected Gain of a Fair Guessing Game --- p.34
Chapter 2.4.2 --- Analysis --- p.41
Chapter 2.5 --- Tuning the Importance of Opposite Information --- p.48
Chapter 2.6 --- Conclusions --- p.53
Chapter 3 --- Generalized Expected Gain Model --- p.56
Chapter 3.1 --- Introduction --- p.58
Chapter 3.2 --- Table Models --- p.60
Chapter 3.2.1 --- Private Table --- p.62
Chapter 3.2.2 --- Published Table --- p.62
Chapter 3.3 --- Expected Gain Model --- p.63
Chapter 3.3.1 --- Random Variable and Probability Distribution --- p.64
Chapter 3.3.2 --- Public Information --- p.64
Chapter 3.3.3 --- Private Information --- p.65
Chapter 3.3.4 --- Expected Gain Model --- p.66
Chapter 3.4 --- Generalization Algorithm --- p.75
Chapter 3.4.1 --- Generalization Property and Subset Property --- p.75
Chapter 3.4.2 --- Modified Version of Incognito --- p.78
Chapter 3.5 --- Related Work --- p.80
Chapter 3.5.1 --- k-Anonymity --- p.80
Chapter 3.5.2 --- l-Diversity --- p.81
Chapter 3.5.3 --- Confidence Bounding --- p.83
Chapter 3.5.4 --- t-Closeness --- p.84
Chapter 3.6 --- Experiments --- p.85
Chapter 3.6.1 --- Experiment Set 1: Average/Max/Min Expected Gain --- p.85
Chapter 3.6.2 --- Experiment Set 2: Expected Gain Distribution --- p.90
Chapter 3.6.3 --- Experiment Set 3: Modified Version of Incognito --- p.95
Chapter 3.7 --- Conclusions --- p.99
Chapter 4 --- Negative Association Immunity --- p.100
Chapter 4.1 --- Introduction --- p.100
Chapter 4.2 --- Related Work --- p.104
Chapter 4.3 --- Negative Association Immunity and Value Suppression --- p.107
Chapter 4.3.1 --- Negative Association --- p.108
Chapter 4.3.2 --- Negative Association Immunity --- p.111
Chapter 4.3.3 --- Achieving Negative Association Immunity by Value Suppression --- p.114
Chapter 4.4 --- Local Search Algorithm --- p.123
Chapter 4.5 --- Experiments --- p.125
Chapter 4.5.1 --- Settings --- p.125
Chapter 4.5.2 --- Results and Discussions --- p.128
Chapter 4.6 --- Conclusions --- p.129
Chapter 5 --- Non-Centralized Distinct l-Diversity --- p.130
Chapter 5.1 --- Introduction --- p.130
Chapter 5.2 --- Related Work --- p.138
Chapter 5.3 --- Table Models --- p.140
Chapter 5.3.1 --- Private Tables --- p.140
Chapter 5.3.2 --- Published Tables --- p.141
Chapter 5.4 --- Private Information Deduced from Multiple Published Tables --- p.143
Chapter 5.4.1 --- Private Information Deduced by Simple Counting on Each Published Table --- p.143
Chapter 5.4.2 --- Private Information Deduced from Multiple Published Tables --- p.145
Chapter 5.4.3 --- Probabilistic Table --- p.156
Chapter 5.5 --- Non-Centralized Distinct l-Diversity and Algorithm --- p.158
Chapter 5.5.1 --- Non-centralized Distinct l-diversity --- p.159
Chapter 5.5.2 --- Algorithm --- p.165
Chapter 5.5.3 --- Theorems --- p.171
Chapter 5.6 --- Experiments --- p.174
Chapter 5.6.1 --- Settings --- p.174
Chapter 5.6.2 --- Metrics --- p.176
Chapter 5.6.3 --- Results and Discussions --- p.179
Chapter 5.7 --- Conclusions --- p.181
Chapter 6 --- Conclusions --- p.183
Bibliography --- p.186
Zhang, X. "Toward scalable and cost-effective privacy-preserving big data publishing in cloud computing." Thesis, 2014. http://hdl.handle.net/10453/30324.
Big data and cloud computing are two disruptive trends today, presenting numerous opportunities to the IT industry and research communities while also posing significant challenges to them. The massive increase in computing power and data storage capacity provisioned by the cloud, and the advances in big data mining and analytics, have expanded the scope of information available to businesses, government, and individuals by orders of magnitude. A major obstacle to the adoption of cloud computing in sectors such as health and business for big data analysis is the privacy risk associated with releasing data sets to third parties in the cloud. The data sets in these sectors often contain personal privacy-sensitive data, e.g., electronic health records and financial transaction records, while the same data sets can offer significant economic and social benefits if analysed or mined by organizations such as disease research centres. Although some privacy issues are not new, the situation is aggravated by features of cloud computing like ubiquitous access and multi-tenancy, and by the three V properties of big data, i.e., Volume, Velocity and Variety. Therefore, it remains a significant challenge to achieve privacy-preserving big data publishing in cloud computing. A widely adopted technique for privacy-preserving data publishing with semantic correctness guarantees is to anonymise data via generalisation, and a bundle of anonymisation approaches have been proposed. However, most existing approaches are either inherently sequential or distributed without directly optimising scalability, rendering them unsuitable for data-intensive applications and inapplicable to state-of-the-art parallel and distributed paradigms like MapReduce. In this thesis, we mainly investigate the problem of big data anonymisation for privacy preservation from the perspectives of scalability and cost-effectiveness.
The cloud computing advantages of on-demand resource provisioning, rapid elasticity, and the pay-as-you-go model are exploited to address the problem, aiming at high scalability and cost-effectiveness. Specifically, we examine three major phases in the lifecycle of privacy-preserving data publishing or sharing in cloud environments: data anonymisation, anonymous data update, and anonymous data management. Accordingly, a scalable and cost-effective privacy-preserving framework is proposed to provide a holistic conceptual foundation for privacy preservation over big data and to enable users to realize the full potential of the high scalability, elasticity, and cost-effectiveness of the cloud. We develop a corresponding prototype system consisting of a series of solutions to the scalability issues that lie in the three phases, based on MapReduce, currently the de facto standard paradigm for big data processing, for the sake of high scalability, cost-effectiveness, and compatibility with other big data mining and analytical tools. Through extensive experiments on real-world data sets, this thesis demonstrates that our solutions can significantly improve the scalability and cost-effectiveness of big data privacy preservation compared to existing approaches.
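The MapReduce formulation this line of work relies on can be pictured with a toy, in-process simulation: mappers emit (generalized quasi-identifier, 1) pairs, and reducers sum them, yielding the equivalence-class sizes a driver would test against k. The generalization rule and field names below are invented for illustration and are not the thesis's actual algorithm:

```python
from collections import defaultdict

def map_phase(records, generalize):
    # Mapper: emit one (generalized quasi-identifier, 1) pair per record.
    for r in records:
        yield generalize(r), 1

def reduce_phase(pairs):
    # Reducer: sum counts per key, giving the size of each equivalence class.
    counts = defaultdict(int)
    for key, one in pairs:
        counts[key] += one
    return dict(counts)

# Hypothetical generalization rule: age -> decade, zip -> 3-digit prefix.
generalize = lambda r: (r["age"] // 10 * 10, r["zip"][:3])

data = [{"age": 34, "zip": "44012"}, {"age": 37, "zip": "44099"},
        {"age": 52, "zip": "44120"}]
sizes = reduce_phase(map_phase(data, generalize))
print(sizes)  # {(30, '440'): 2, (50, '441'): 1}
```

The appeal of this shape is that both phases parallelize trivially over record shards, which is exactly the scalability property sequential anonymizers lack.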
Ho, Shih-Han (何是翰). "Maximizing Discriminability on Dynamic Attributes for Privacy-Preserving Data Publishing Using K-Anonymity." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/w2cpcb.
National Chung Hsing University
Department of Electrical Engineering
Academic year 107
There are increasing demands for open data in scientific, medical, and social applications. Open data is a new trend, and more data are being released for data mining and decision-making. To avoid the leakage of personal privacy caused by the release of data, the data must be processed by privacy protection methods before being released. Since optimally enforcing privacy-preserving models like K-anonymity and L-diversity is NP-hard, most previous privacy-preserving methods trade off privacy and data utility by designing heuristic algorithms that reduce information loss. Different from previous works, our main idea is that released data should be privacy-protected while providing different levels of discriminability for different individuals, i.e., observers with different backgrounds should obtain different levels of information from the released data. For example, data may be required to be released for the supervision of public administration or the financial inspection of foundations. Also, in a mass casualty incident (MCI), up-to-date information on injured and ill patients, especially their status and location, should be released so that ambulance and medical staff, or the patients' families, can easily find the resources the patients require. However, few privacy protection methods in the literature take both privacy and data discriminability into account. In this thesis, we study the privacy protection and data discriminability problem and find that the attributes of a dataset can be classified into static and dynamic attributes. For this dynamic-discriminability privacy-preserving problem, we propose a new privacy-preserving model called the K_1 K_2-anonymization model. It ensures that each equivalence class on the static attributes still satisfies K_1-anonymity, while within an equivalence class the number of records with the same dynamic attribute values must be less than K_2.
If the dynamic attribute values within an equivalence class are too similar, no solution exists because the records cannot be differentiated. We propose a clustering-based SimDiv algorithm that makes the dynamic attributes within equivalence classes more discriminable, as a compromise solution to the K_1 K_2-anonymization problem. To validate its effectiveness, we conduct experiments on a real dataset. The experimental results show that the proposed method outperforms other methods of similar models in terms of discriminability on dynamic attributes.
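Based only on the description above, the K_1 K_2 condition might be checked as follows. This is a sketch under our own reading of the abstract; the attribute names, the sample records, and the exact "less than K_2" inequality are assumptions, not the thesis's code:

```python
from collections import Counter, defaultdict

def satisfies_k1_k2(records, static_attrs, dynamic_attrs, k1, k2):
    """True iff each equivalence class on the static attributes has at
    least k1 records, and within each class fewer than k2 records share
    any one combination of dynamic attribute values."""
    classes = defaultdict(list)
    for r in records:
        classes[tuple(r[a] for a in static_attrs)].append(r)
    for members in classes.values():
        if len(members) < k1:          # static side: plain K1-anonymity
            return False
        dyn = Counter(tuple(r[a] for a in dynamic_attrs) for r in members)
        if any(n >= k2 for n in dyn.values()):  # dynamic side: discriminability
            return False
    return True

rows = [
    {"age": "3*", "status": "stable"},
    {"age": "3*", "status": "critical"},
    {"age": "3*", "status": "stable"},
]
print(satisfies_k1_k2(rows, ["age"], ["status"], 3, 2))  # False: 'stable' repeats twice within the class
```

Note the tension the SimDiv algorithm addresses: the static side wants classes large and uniform, while the dynamic side fails whenever a class's dynamic values are too similar.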