Title: Novel Family-control Analysis Can Highly Prioritize Sequence Variants in Familial Cardiomyopathy
Abstract: Detecting causative genes for cardiomyopathy among hundreds of candidate variants remains challenging, especially in small pedigrees where conventional linkage analysis is underpowered or even uninformative. To overcome this problem, we developed a novel approach that combines family data with population controls. Because the true disease-causing variant is unknown, the idea is that a set of variants surrounding a true disease locus should have different frequencies in affected family members and in unaffected controls. We therefore measure the "distance" between sets of variants in affected and unaffected individuals using the Hamming distance. For a set of base pairs flanking a candidate locus, our Hamming Distance Ratio (HDR) is the proportion of base pairs that differ between two individuals. We calculate the HDR for all pairs of individuals and distinguish pairs containing one case and one control from pairs containing two controls. We assess the difference in mean HDR between these two types of pairs with the t statistic. In two hypertrophic cardiomyopathy families, known pathogenic mutations had previously been detected: c.173G>A (p.R58Q) in the MYL2 gene (rs104894369, family A) and c.746G>A (p.R249Q) in the MYH7 gene (rs3218713, family B). Our method ranked these known disease variants within the top 3% of all candidate variants. This new statistical method for prioritizing disease regions will be useful for small autosomal dominant pedigrees.
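The HDR comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: each individual's flanking region is encoded as an equal-length string of bases, the t statistic is computed in Welch's unequal-variance form (the abstract does not say which form is used), and the function names `hdr`, `welch_t`, and `score_locus` are hypothetical.

```python
from itertools import combinations, product
from math import sqrt
from statistics import mean, variance


def hdr(a, b):
    """Hamming Distance Ratio: proportion of positions where a and b differ."""
    if len(a) != len(b):
        raise ValueError("both sequences must cover the same positions")
    if not a:
        raise ValueError("sequences must not be empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def welch_t(xs, ys):
    """Welch's t statistic for the difference in means (assumed form).

    Needs at least two values per sample and a nonzero combined variance.
    """
    se = sqrt(variance(xs) / len(xs) + variance(ys) / len(ys))
    return (mean(xs) - mean(ys)) / se


def score_locus(cases, controls):
    """Compare mean HDR of case-control pairs with control-control pairs."""
    # Every pair made of one affected family member and one control.
    case_control = [hdr(c, k) for c, k in product(cases, controls)]
    # Every unordered pair made of two distinct controls.
    control_control = [hdr(a, b) for a, b in combinations(controls, 2)]
    return welch_t(case_control, control_control)


if __name__ == "__main__":
    # Made-up toy sequences, not data from the study.
    toy_cases = ["ACGTTGCA", "ACGTTGCA"]
    toy_controls = ["ACGAAGCA", "ACGAAGCT", "ACGAAGCA"]
    print(score_locus(toy_cases, toy_controls))
```

Under these assumptions, a larger positive statistic means that affected individuals differ from controls around the candidate locus more than controls differ from one another, which is the signal used to rank candidates. The pairwise HDR values share individuals and are therefore not independent, so the statistic in this sketch is best read as a ranking score rather than as a formal test.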