Abstract:
Several anonymization techniques, such as
generalization and bucketization,
have been designed for privacy preserving microdata publishing. Recent
work has shown that generalization loses considerable amount of information,
especially for high-dimensional data. Bucketization, on the other hand, does
not prevent membership disclosure and does not apply for data that do not
have a clear separation between quasi-identifying attributes and sensitive
attributes. In this paper, we present a novel technique called slicing, which
partitions the data both horizontally and vertically. We show that slicing
preserves better data utility than generalization and can be used for
membership disclosure protection. Another important advantage of slicing is
that it can handle high-dimensional data. We show how slicing can be used
for attribute disclosure protection and develop an efficient algorithm for
computing the sliced data that obey the ℓ-diversity requirement. Our workload
experiments confirm that slicing preserves better utility than generalization
and is more effective than bucketization in workloads involving the sensitive
attribute. Our experiments also demonstrate that slicing can be used to
prevent membership disclosure.
have been designed for privacy preserving microdata publishing. Recent
work has shown that generalization loses considerable amount of information,
especially for high-dimensional data. Bucketization, on the other hand, does
not prevent membership disclosure and does not apply for data that do not
have a clear separation between quasi-identifying attributes and sensitive
attributes. In this paper, we present a novel technique called slicing, which
partitions the data both horizontally and vertically. We show that slicing
preserves better data utility than generalization and can be used for
membership disclosure protection. Another important advantage of slicing is
that it can handle high-dimensional data. We show how slicing can be used
for attribute disclosure protection and develop an efficient algorithm for
computing the sliced data that obey the ℓ-diversity requirement. Our workload
experiments confirm that slicing preserves better utility than generalization
and is more effective than bucketization in workloads involving the sensitive
attribute. Our experiments also demonstrate that slicing can be used to
prevent membership disclosure.
Algorithm
Used:
Slicing
Algorithms
Advantage
of slicing is its ability to handle high-dimensional data. By
partitioning attributes into columns, slicing reduces the dimensionality of the
data. Each column of the table can be viewed as a sub-table with a lower
dimensionality. Slicing is also different from the approach of publishing
multiple independent sub-tables in that these sub-tables are linked by the
buckets in slicing.
partitioning attributes into columns, slicing reduces the dimensionality of the
data. Each column of the table can be viewed as a sub-table with a lower
dimensionality. Slicing is also different from the approach of publishing
multiple independent sub-tables in that these sub-tables are linked by the
buckets in slicing.