MarkerMatch: a Proximity-Based Probe-Matching Algorithm for Joint Analysis of Copy-Number Variants from Different Genotyping Arrays.
| Authors | |
| Abstract | MOTIVATION: Copy-number variants (CNVs) are a form of genetic structural variation with increasing importance in complex human disorders. Both DNA sequencing and microarray data can be used to detect CNVs, which can be used in genetic association tests. Unlike genotypes, CNV detection in microarrays requires the use of observed intensity signals at each probe, which limits the imputability for analyses that span multiple array types. Thus far, a consensus set of probes (those present on all arrays) has been used to circumvent the problem of differing array-specific sensitivities. This has led to excessive reduction in overall sensitivity since arrays can have an undesirably low probe overlap. To overcome this limitation, we developed MarkerMatch, a proximity-based algorithm that matches probes across different genotyping microarrays to maximize the number of probes considered in the CNV calling algorithm, thereby increasing the resolution and sensitivity while preserving precision. RESULTS: By analyzing CNV calls from 4,906 individuals genotyped across three different arrays, we show that the MarkerMatch approach improves sensitivity by increasing the density of probes available for CNV calling while maintaining precision or improving it relative to the current practice (e.g., use of consensus probes only). We further demonstrate that MarkerMatch matches the CNV detection from current practice in terms of F1 score and PPV for larger CNVs. We also optimize MarkerMatch parameters, DMAX and Method, and find an optimal DMAX setting at 10 kb, with no clear optimal candidate based on Method, indicating that parameters for this metric should be determined on a use case basis. AVAILABILITY: The R package for MarkerMatch is available at: . The code used for analysis and implementation is available at: . The live notebook is available at . SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
| Year of Publication | 2026
|
| Journal | Bioinformatics (Oxford, England)
|
| Date Published | 05/2026
|
| ISSN | 1367-4811
|
| DOI | 10.1093/bioinformatics/btag341
|
| PubMed ID | 42178371
|
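
The abstract describes matching probes across arrays by genomic proximity, bounded by a maximum distance DMAX, but does not spell out the matching rule. The following is a minimal sketch of that general idea, not the MarkerMatch implementation or its R API. The function name `match_probes`, the input layout (chromosome mapped to probe positions in base pairs), and the nearest-neighbor rule with ties going to the lower position are all assumptions made for illustration. The `Method` parameter mentioned in the abstract is not modeled, and one-to-one matching is not enforced. The default of 10,000 bp mirrors the 10 kb DMAX setting reported as optimal.

```python
from bisect import bisect_left


def match_probes(reference, target, dmax=10_000):
    """Pair each reference probe with the nearest target probe on the same
    chromosome, keeping the pair only if the distance is at most dmax (bp).

    reference, target: dict mapping chromosome name -> list of positions.
    Returns a list of (chrom, ref_pos, target_pos, distance) tuples.
    Hypothetical helper for illustration; not part of the MarkerMatch package.
    """
    matches = []
    for chrom, ref_positions in reference.items():
        tgt_positions = sorted(target.get(chrom, []))
        if not tgt_positions:
            continue  # no probes on this chromosome in the target array
        for ref_pos in ref_positions:
            i = bisect_left(tgt_positions, ref_pos)
            # Only the neighbors immediately left and right can be nearest.
            candidates = tgt_positions[max(i - 1, 0):i + 1]
            nearest = min(candidates, key=lambda p: abs(p - ref_pos))
            distance = abs(nearest - ref_pos)
            if distance <= dmax:
                matches.append((chrom, ref_pos, nearest, distance))
    return matches


# Toy positions, invented for the example.
reference = {"chr1": [1_000, 50_000, 200_000]}
target = {"chr1": [1_200, 45_000, 500_000], "chr2": [10]}
result = match_probes(reference, target, dmax=10_000)
```

In this toy input, the probe at 200,000 has no target probe within 10 kb and is left unmatched, which illustrates how DMAX trades probe density against positional agreement between arrays.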