Selecting variant masks to improve power and replicability of gene-level burden tests.

Research Square
Authors
Abstract

Rare coding variant association studies typically perform gene-level association tests in which variants are filtered (or "masked") and aggregated based on functional annotation and allele frequency. Because there is little research and no consensus on which masking strategies to use, we investigated the impact of masking strategies on gene-level burden tests, the most widely used and interpretable type of aggregate association test. A systematic review of 234 studies catalogued 664 masks and masking strategies, which rarely repeated across studies. Analyzing 54 traits within 189,947 UK Biobank exomes, we show that the number of significant associations depends strongly on the masking strategy employed (ranging from 58 to 2,523 associations) and that, consequently, separate published analyses of this dataset report minimally overlapping associations (<30%). By empirically determining the mask combinations that maximize the number of significant associations, we propose masking strategies that detect twice as many significant low-frequency and rare variant associations as the "average" strategies previously employed, with consistent performance across many traits. Our analyses demonstrate the inconsistency of previously used variant masking strategies and provide a simple solution to increase power and replicability in future studies.
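
To make the notion of a "mask" concrete, the following is a minimal Python sketch of how a mask might filter variants by functional annotation and allele frequency before they are aggregated into a per-gene burden score. It is an illustration only, not the authors' pipeline: the `Variant` and `Mask` classes, the annotation labels, the gene names, the frequency threshold, and the choice of summing alternate-allele counts as the burden score are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    annotation: str     # illustrative labels, e.g. "pLoF", "missense"
    allele_freq: float  # alternate-allele frequency in the cohort


@dataclass(frozen=True)
class Mask:
    annotations: frozenset  # annotation classes the mask retains
    max_allele_freq: float  # upper allele-frequency bound (inclusive)

    def keeps(self, v):
        return (v.annotation in self.annotations
                and v.allele_freq <= self.max_allele_freq)


def burden_scores(variants, genotypes, mask):
    """Sum alternate-allele counts over the variants the mask keeps.

    genotypes[i][j] is the alternate-allele count (0, 1 or 2) of
    sample i at variant j. Returns {gene: [score per sample]}.
    """
    n_samples = len(genotypes)
    scores = {}
    for j, v in enumerate(variants):
        if not mask.keeps(v):
            continue
        per_sample = scores.setdefault(v.gene, [0] * n_samples)
        for i in range(n_samples):
            per_sample[i] += genotypes[i][j]
    return scores


# Hypothetical toy data: four variants, three samples.
variants = [
    Variant("GENE_A", "pLoF", 0.0005),
    Variant("GENE_A", "missense", 0.02),
    Variant("GENE_A", "missense", 0.0008),
    Variant("GENE_B", "synonymous", 0.0001),
]
genotypes = [
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 2, 0, 1],
]

strict = Mask(frozenset({"pLoF"}), 0.001)
broad = Mask(frozenset({"pLoF", "missense"}), 0.001)

strict_scores = burden_scores(variants, genotypes, strict)
broad_scores = burden_scores(variants, genotypes, broad)
print(strict_scores)
print(broad_scores)
```

In this toy example the two masks assign different burden scores to the same samples, which gives some intuition for why the number of significant associations could vary with the masking strategy.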

Year of Publication
2025
Journal
Research Square
Date Published
04/2025
ISSN
2693-5015
DOI
10.21203/rs.3.rs-6322956/v1
PubMed ID
40321767