Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Data utility and privacy protection in data publishing

Loukides, Grigorios 2008. Data utility and privacy protection in data publishing. PhD Thesis, Cardiff University.

PDF - Accepted Post-Print Version (7MB)


Data about individuals is being increasingly collected and disseminated for purposes such as business analysis and medical research. This has raised some privacy concerns. In response, a number of techniques have been proposed which attempt to transform data prior to its release so that sensitive information about the individuals contained within it is protected. k-Anonymisation is one such technique that has attracted much recent attention from the database research community. k-Anonymisation works by transforming data in such a way that each record is made identical to at least k-1 other records with respect to those attributes that are likely to be used to identify individuals. This helps prevent sensitive information associated with individuals from being disclosed, as each individual is represented by at least k records in the dataset. Ideally, a k-anonymised dataset should maximise both data utility and privacy protection, i.e. it should allow intended data analytic tasks to be carried out without loss of accuracy while preventing sensitive information disclosure, but these two notions are conflicting and only a trade-off between them can be achieved in practice. The existing works, however, focus on how either the utility or the protection requirement may be satisfied, which often results in anonymised data with an unnecessarily and/or unacceptably low level of utility or protection. In this thesis, we study how to construct k-anonymous data that satisfies both data utility and privacy protection requirements. We propose new criteria to capture utility and protection requirements, and new algorithms that allow k-anonymisations with required utility/protection trade-off or guarantees to be generated.
Our extensive experiments using both benchmarking and synthetic datasets show that our methods are efficient, can produce k-anonymised data with desired properties, and outperform the state-of-the-art methods in retaining data utility and providing privacy protection.
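
The k-anonymity condition described in the abstract can be illustrated with a minimal sketch. This is not one of the algorithms proposed in the thesis; it only checks the definition, namely that every combination of identifying attribute values is shared by at least k records. The function name `is_k_anonymous`, the attribute names and the toy records are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    occurs in at least k records (hypothetical helper, not from the thesis)."""
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers)
        for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical toy data: age and postcode are assumed to be already generalised.
records = [
    {"age": "20-29", "postcode": "CF10", "diagnosis": "flu"},
    {"age": "20-29", "postcode": "CF10", "diagnosis": "asthma"},
    {"age": "30-39", "postcode": "CF24", "diagnosis": "flu"},
    {"age": "30-39", "postcode": "CF24", "diagnosis": "diabetes"},
]

# Each (age, postcode) group contains two records, so the data is
# 2-anonymous but not 3-anonymous with respect to these attributes.
print(is_k_anonymous(records, ["age", "postcode"], k=2))
print(is_k_anonymous(records, ["age", "postcode"], k=3))
```

In this sketch, generalising values more coarsely (for example, wider age ranges) would make larger k achievable but would also reduce how precisely the data can be analysed, which is the utility/protection trade-off the abstract refers to.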

Item Type: Thesis (PhD)
Status: Unpublished
Schools: Computer Science & Informatics
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
ISBN: 9781303213465
Date of First Compliant Deposit: 30 March 2016
Last Modified: 12 Jun 2019 02:52
