Empirically determined baseline masking strategies and other considerations for gene-level burden tests.
| Authors | |
| Abstract | Rare-variant association studies typically perform gene-level tests in which coding variants are filtered (or 'masked') and aggregated based on functional annotation and allele frequency. Through a systematic literature review, we cataloged 664 masks used across 234 studies and found that masking strategies (that is, sets of masks) rarely repeat across studies and are rarely justified. To quantify their impact on association results, we applied all previously employed strategies to 54 traits within 189,947 UK Biobank exomes. Here we find that the number of significant associations greatly depends on the masking strategy (ranging from 58 to 2,523 associations), which is a key reason for the modest overlap (<30%) of associations between separate published analyses of this dataset. We empirically determine masking strategies with high discovery power for low-frequency and rare variant gene-level associations across numerous datasets and traits, and we use these to explore the impact of other factors on burden test results. These findings offer a baseline strategy in burden tests to increase study power and replicability, addressing one source of inconsistency in previous studies. |
| Year of Publication | 2026 |
| Journal | Nature Genetics |
| Date Published | 05/2026 |
| ISSN | 1546-1718 |
| DOI | 10.1038/s41588-026-02597-9 |
| PubMed ID | 42104092 |