Privacy-Preserving Biomedical Database Queries with Optimal Privacy-Utility Trade-Offs.

Cell Syst
Authors
Abstract

Sharing data across research groups is an essential driver of biomedical research. While interactive query-answering systems for biomedical databases aim to facilitate the sharing of aggregate insights without divulging sensitive individual-level data, query answers can still leak private information about the individuals in the database. Here, we draw upon recent advances in differential privacy to introduce query-answering mechanisms that provably maximize the utility (e.g., accuracy) of the system while achieving formal privacy guarantees. We demonstrate our accuracy improvement over existing approaches for a range of use cases, including cohort discovery, variant lookup, and association testing. Our new theoretical results extend the proof of optimality of the underlying mechanism, previously known only for count queries with symmetric utility functions, to more general utility functions needed for key biomedical research workflows. Our work presents a path toward interactive biomedical databases that achieve the optimal privacy-utility trade-offs permitted by the theory of differential privacy.
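For readers unfamiliar with differentially private count queries, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not name the mechanism used, so this sketch assumes one standard choice for count queries, the two-sided geometric mechanism, with the result clamped to the valid range. The function names (geometric_noise, private_count) and parameters (true_count, n, epsilon) are hypothetical.

```python
import math
import random


def geometric_noise(epsilon, rng):
    """Draw integer noise Z with P(Z = z) proportional to exp(-epsilon * |z|).

    Assumption: sampled as the difference of two i.i.d. one-sided geometric
    variables, which yields the two-sided geometric distribution.
    """
    alpha = math.exp(-epsilon)

    def one_sided():
        # Inverse-CDF sampling: P(G >= k) = alpha ** k for k = 0, 1, 2, ...
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        return int(math.floor(math.log(u) / math.log(alpha)))

    return one_sided() - one_sided()


def private_count(true_count, n, epsilon, rng=None):
    """Return a noisy count in [0, n] for a cohort-size style query.

    A count changes by at most 1 when one individual is added or removed,
    so two-sided geometric noise with parameter epsilon gives
    epsilon-differential privacy. Clamping is post-processing and does not
    weaken the guarantee.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    noisy = true_count + geometric_noise(epsilon, rng)
    return min(max(noisy, 0), n)


if __name__ == "__main__":
    # Hypothetical example: 50 matching individuals in a database of 100.
    print(private_count(true_count=50, n=100, epsilon=1.0))
```

Smaller values of epsilon give stronger privacy but noisier answers; that tension is the privacy-utility trade-off the abstract refers to.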

Year of Publication
2020
Journal
Cell Syst
Date Published
2020 Apr 28
ISSN
2405-4720
DOI
10.1016/j.cels.2020.03.006
PubMed ID
32359425