(2021). Scalable Distributed Data Anonymization. Retrieved from http://hdl.handle.net/10446/202628

Scalable Distributed Data Anonymization

Facchinetti, Dario; Foresti, Sara; Oldani, Gianluca; Paraboschi, Stefano; Rossi, Matthew; Samarati, Pierangela
2021-01-01

Abstract

We present an approach for enabling a distributed anonymization process over large collections of sensor data. Our approach anonymizes large datasets (which might not fit in main memory) using an arbitrary number of workers within the Spark framework. We describe how to parallelize the anonymization process through a proper partitioning of the dataset. Our experimental evaluation shows that the proposed approach is scalable and does not affect the quality of the anonymized dataset.
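
The partition-then-anonymize idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm and does not use Spark: a thread pool stands in for the workers, and the quantile split on a single attribute, the range generalization, and the group-size parameter k are all assumptions chosen for illustration. Every function name below is hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition_by_quantile(records, key, n_fragments):
    """Sort on one attribute and cut the data into near-equal fragments."""
    ordered = sorted(records, key=lambda r: r[key])
    size, extra = divmod(len(ordered), n_fragments)
    fragments, start = [], 0
    for i in range(n_fragments):
        end = start + size + (1 if i < extra else 0)
        fragments.append(ordered[start:end])
        start = end
    return fragments


def anonymize_fragment(fragment, key, k):
    """Replace `key` with a (min, max) range shared by at least k records.

    Assumes the fragment holds at least k records.
    """
    ordered = sorted(fragment, key=lambda r: r[key])
    n_groups = max(1, len(ordered) // k)
    out = []
    for g in range(n_groups):
        start = g * k
        # The last group absorbs the remainder so no group is smaller than k.
        end = (g + 1) * k if g < n_groups - 1 else len(ordered)
        group = ordered[start:end]
        lo, hi = group[0][key], group[-1][key]
        out.extend({**r, key: (lo, hi)} for r in group)
    return out


def anonymize(records, key, k, n_workers):
    """Split the data, anonymize each fragment independently, then merge."""
    if len(records) // n_workers < k:
        raise ValueError("each fragment needs at least k records")
    fragments = partition_by_quantile(records, key, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda f: anonymize_fragment(f, key, k), fragments))
    return [r for part in parts for r in part]


def smallest_class_size(records, key):
    """Size of the smallest set of records sharing one generalized value."""
    return min(Counter(r[key] for r in records).values())
```

For example, `anonymize(readings, "temp", k=3, n_workers=2)` on a list of dictionaries with a numeric `temp` field returns the same number of records, each with `temp` replaced by a range. Because each fragment covers a contiguous slice of the sorted values, workers never need to exchange records, which is the property a partitioning step of this kind is meant to provide.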
2021
De Capitani Di Vimercati, Sabrina; Facchinetti, Dario; Foresti, Sara; Oldani, Gianluca; Paraboschi, Stefano; Rossi, Matthew; Samarati, Pierangela
Files attached to this record:
File: percom2021.pdf
Access: archive managers only
Version: postprint (refereed version / accepted without refereeing)
License: Aisberg default license
File size: 361.51 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/10446/202628
Citations
  • Scopus 4
  • ISI (Web of Science) 3