(2021). Scalable Distributed Data Anonymization. Retrieved from http://hdl.handle.net/10446/202628
Scalable Distributed Data Anonymization
Facchinetti, Dario; Foresti, Sara; Oldani, Gianluca; Paraboschi, Stefano; Rossi, Matthew; Samarati, Pierangela
2021-01-01
Abstract
We present an approach for enabling a distributed anonymization process over large collections of sensor data. Our approach anonymizes large datasets (which might not fit in main memory) using an arbitrary number of workers within the Spark framework. We describe how to parallelize the anonymization process through a proper partitioning of the dataset. Our experimental evaluation shows that the proposed approach is scalable and does not affect the quality of the anonymized dataset.
Files attached to the record:
File | Access | Version | License | File size | Format |
---|---|---|---|---|---|
percom2021.pdf | Archive managers only | Postprint (refereed version / accepted without refereeing) | Aisberg default license | 361.51 kB | Adobe PDF |