Learning Your Identity and Disease from Research Papers: Information Leaks in Genome Wide Association Study

Abstract

Genome-wide association studies (GWAS) aim to discover associations between genetic variations, particularly single-nucleotide polymorphisms (SNPs), and common diseases, and are widely recognized as one of the most important and active areas of biomedical research. Equally well known are the privacy implications of such studies, which were brought into the limelight by the recent attack proposed by Homer et al. Homer's attack demonstrates that a participant in a GWAS can be identified by analyzing the allele frequencies of a large number of SNPs. Our research found that this threat is, unfortunately, significantly understated. In this paper, we demonstrate that individuals can be identified from even a relatively small set of statistics, such as those routinely published in GWAS papers. We present two attacks. The first extends Homer's attack with a much more powerful test statistic, based on the correlations among different SNPs as described by the coefficient of determination ($r^2$). This attack can determine the presence of an individual in a GWAS from the statistics related to a couple of hundred SNPs. The second attack can lead to complete disclosure of hundreds of the participants' SNPs by analyzing information derived from the published statistics. We also found that these attacks can succeed even when the precision of the statistics is low and part of the data is missing, which limits the effectiveness of simple defenses such as reducing precision or withholding data. We evaluated our attacks on real human genomes from the International HapMap project and concluded that such threats are completely realistic.
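For readers unfamiliar with the $r^2$ statistic mentioned in the abstract, the following is a minimal sketch of how the coefficient of determination between two biallelic SNPs is commonly computed from phased haplotypes. It is not the test statistic from the report itself; the function name `r_squared` and the toy haplotype lists are assumptions made only for illustration.

```python
def r_squared(haplotypes):
    """Return r^2 between two biallelic SNPs.

    haplotypes: list of (a, b) pairs, where a and b are 0/1 allele
    codes at the first and second SNP on the same haplotype.
    (Illustrative helper, not taken from the report.)
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes given")
    # Frequencies of allele 1 at each SNP, and of the 1-1 haplotype.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    # D is the deviation of the joint frequency from independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Toy data (hypothetical): two SNPs that always co-occur, and two
# SNPs whose alleles are independent of each other.
linked = [(1, 1), (0, 0), (1, 1), (0, 0)]
independent = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(r_squared(linked))
print(r_squared(independent))
```

A value near 1 means the allele at one SNP largely determines the allele at the other, while a value near 0 means the two SNPs carry essentially independent information.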

Series and Number:

Indiana University Computer Science Technical Reports; TR680

Rights

This work is protected by copyright unless stated otherwise.
