Privacy-Preservation for Publishing Sample Availability Data with Personal Identifiers
Ali Gholami1, Erwin Laure1, Peter Somogyi2, Ola Spjuth2, Salman Niazi3, and Jim Dowling3
1. Swedish e-Science Research Center and Department of HPCViz and PDC, School of Computer Science and Communication, KTH Royal Institute of Technology
2. Swedish e-Science Research Center and Department of Medical Epidemiology and Biostatistics, Karolinska Institute
3. Swedish e-Science Research Center and School of Information and Communication Technology, KTH Royal Institute of Technology
Abstract—Medical organizations collect, store, and process vast amounts of sensitive information about patients. Easy access to this information is crucial to improving medical research, but in many institutions cumbersome security measures and walled gardens have created a situation where even information about what medical data exists is not available. One of the main security challenges in this area is enabling researchers to cross-link different medical studies while preserving the privacy of the patients involved. In this paper, we introduce a privacy-preserving system for publishing sample availability data that allows researchers to make queries that crosscut different studies. That is, researchers can ask questions such as how many patients have had both diabetes and prostate cancer, where the diabetes and prostate cancer information originates from different clinical registries. We realize our solution with a two-level anonymization mechanism, in which our toolkit for publishing availability data first pseudonymizes personal identifiers and then anonymizes sensitive attributes. Our toolkit also includes a web-based server that stores the encrypted pseudonymized sample data and allows researchers to execute cross-linked queries across different study data. We believe that our toolkit is a first step toward supporting the privacy-preserving publication of data containing personal identifiers.
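The two-level idea described in the abstract can be illustrated with a minimal sketch. It is not the toolkit's actual implementation: the abstract does not specify the algorithms, so keyed hashing (HMAC-SHA256) for pseudonymization and age-range generalization for attribute anonymization are assumptions made here for illustration, as are the names `SECRET_KEY`, `pseudonymize`, `generalize_age`, and `publish`, and all sample records.

```python
import hashlib
import hmac

# Hypothetical secret shared by the publishing registries (an assumption;
# the abstract does not describe key management).
SECRET_KEY = b"example-key-shared-by-registries"


def pseudonymize(personal_id: str, key: bytes = SECRET_KEY) -> str:
    """Level 1: replace a personal identifier with a keyed pseudonym."""
    return hmac.new(key, personal_id.encode("utf-8"), hashlib.sha256).hexdigest()


def generalize_age(age: int, width: int = 10) -> str:
    """Level 2: coarsen an exact age into a range (a simple stand-in for
    anonymizing a sensitive attribute)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def publish(records, key: bytes = SECRET_KEY) -> dict:
    """Build the publishable view of one registry: pseudonym -> coarse attributes."""
    return {
        pseudonymize(pid, key): {"age_range": generalize_age(age)}
        for pid, age in records
    }


# Made-up records: (personal identifier, age).
diabetes_registry = [("19500101-1234", 64), ("19620315-5678", 52), ("19701224-9012", 44)]
prostate_registry = [("19500101-1234", 64), ("19701224-9012", 44), ("19810909-3456", 33)]

diabetes_pub = publish(diabetes_registry)
prostate_pub = publish(prostate_registry)

# Cross-linked query: how many patients appear in both registries?
# The same identifier yields the same pseudonym under the same key,
# so the registries can be joined without exposing the identifier.
count_both = len(diabetes_pub.keys() & prostate_pub.keys())
print(count_both)
```

Because both registries use the same key in this sketch, matching pseudonyms identify the same patient across studies, which is what makes the count query possible. Anyone holding the key could recompute pseudonyms from known identifiers, so key handling would need care in a real deployment.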
Index Terms—privacy protection, data encryption, distributed systems, database security
Cite: Mohammad Zavid Parvez, Manoranjan Paul, and Michael Antolovich, "Detection of Pre-stage of Epileptic Seizure by Exploiting Temporal Correlation of EMD Decomposed EEG Signals," Journal of Medical and Bioengineering, Vol. 4, No. 2, pp. 117-125, April 2015. Doi: 10.12720/jomb.4.2.117-125