Official Journal of the Human Genome Organisation

  • Review Article
  • Open access
In silico investigations on functional and haplotype tag SNPs associated with congenital long QT syndromes (LQTSs)

Abstract

Single-nucleotide polymorphisms (SNPs) play a major role in the understanding of the genetic basis of many complex human diseases. It is still a major challenge to identify the functional SNPs in disease-related genes. In this review, the genetic variation that can alter the expression and function of five genes, namely KCNQ1, KCNH2, SCN5A, KCNE1 and KCNE2, with a potential role in the development of congenital long QT syndrome (LQTS) was analyzed. Of the total of 3,309 SNPs in all five genes, 27 non-synonymous SNPs (nsSNPs) in the coding region and 44 SNPs in the 5′ and 3′ untranslated regions (UTRs) were identified as functionally significant. The SIFT and PolyPhen programs were used to analyze the nsSNPs, and the FastSNP and UTRscan programs were used to analyze the SNPs in the 5′ and 3′ UTRs. Of the five selected genes, KCNQ1 has the highest number of haplotype blocks (26) and the highest number of tag SNPs (6) with a complete linkage disequilibrium value. The gene SCN5A has ten haplotype blocks and four tag SNPs. Both KCNE1 and KCNE2 genes have only one haplotype block and four tag SNPs. Four haplotype blocks and two tag SNPs were obtained for the KCNH2 gene. This review also reports the copy number variations (CNVs), expressed sequence tags (ESTs) and genome survey sequences (GSS) of the selected genes. The computational predictions are in good agreement with earlier experimental work on LQTS.

Introduction

Inherited mutations of ion channel proteins are prevalent, and the disorders caused by them, including epilepsy, febrile seizures, Dent’s disease and cardiac arrhythmias, are now referred to as channelopathies (Marban 2002; Jentsch 2000; Schwake et al. 2001; Lossin et al. 2002; Jurkat-Rott and Lehmann-Horn 2001; Kullmann 2002). Congenital LQTS is a genetically heterogeneous disorder associated with mutations in various cardiac ion channel genes that prolongs repolarization of the ventricular myocyte. The cardiac repolarization process is known to be strongly dependent on various parameters, such as heart rate (Bazett 1920), age (Reardon and Malik 1996), sex (Yang et al. 1994; Legato 2000), plasma levels of electrolytes (Nagasaka et al. 1972), medications (Kaab et al. 2003) and inherited and acquired pathological conditions (Tomaselli and Marban 1999). The molecular basis for LQTS is delayed repolarization of the myocardium, which prolongs the cardiac action potential, increasing the QT interval measured on the surface electrocardiogram. Mutations in five ion channel genes, namely KCNQ1, KCNH2, SCN5A, KCNE1 and KCNE2, cause the majority of cases of inherited LQTS. Recently, genetic approaches to understand diversity in cardiac function and susceptibility to cardiac arrhythmias have focused in particular on ion channels and gap junction proteins as key components in normal and abnormal cardiac electrophysiology.

A central interest of association studies is the relationship between SNPs and disease development. There are millions of SNPs in the human genome, which makes it difficult to plan costly population-based genotyping so that it targets the SNPs most likely to affect phenotypic functions and ultimately contribute to disease development. SNP markers are preferred for disease association studies because of their high abundance along the human genome, but the current throughput of technology is inadequate for genotyping all existing SNPs in a large number of samples. Thus, the selection of a maximally informative set of SNPs (tag SNPs) for genome-wide association studies has attracted much attention. Genome-wide association methods based on linkage disequilibrium (LD) offer a promising approach to detect genetic variation responsible for common human diseases. Several large-scale studies dissecting LD patterns across the human genome based on SNPs have revealed that these patterns vary greatly, with some regions of high LD interspersed with regions of low LD (Gabriel et al. 2002; Patil et al. 2001). Such LD patterns make it possible to select tag SNPs: in the high LD regions, which are referred to as blocks in the literature, only a small number of SNPs are sufficient to capture most of the haplotype structure (Johnson et al. 2001; Patil et al. 2001).

Understanding the functions of single nucleotide polymorphisms (SNPs) can greatly help to understand the genetics of the human phenotype variation and especially the genetic basis of complex human diseases like long QT syndrome (Schork et al. 2000). Therefore, it is urgent to develop and apply methods to prioritize target SNPs. To date, at least five gene loci have been identified for the LQTS disorder, of which four encode for potassium channels, KCNQ1, KCNH2, KCNE1 and KCNE2, and one encodes for the sodium channel, SCN5A. KCNQ1, the gene responsible for causing LQT1, was mapped to chromosome 11p15.5 (Keating et al. 1991), KCNH2 to LQT2 locus on chromosome 7q35–36 (Curran et al. 1995), SCN5A to LQT3 locus on chromosome 3p21–24 (Jiang et al. 1994), KCNE1 to LQT5 locus on chromosome 21q22 (Barhanin et al. 1996) and KCNE2 to LQT6 locus on chromosome 21q22 (Abbott et al. 1999). LQT4 has been attributed to ankyrin B mutation (Mohler et al. 2003), and its locus was mapped to chromosome 4q25–27 (Schott et al. 1995). The ANKB gene, which encodes for Ankyrin-B protein to cause type 4 LQTS, is not included in this work since no polymorphisms in Ankyrin-B associated with LQTS have been reported so far. In this review, computationally predicted most deleterious non-synonymous SNPs (nsSNPs) in the coding regions and SNPs in the 5′ and 3′ un-translated regions (UTR) of all five genes are reported. Also, the results obtained through computation were compared with an experimental work reported earlier, and this led to the conclusion that there are many other deleterious SNPs that have to be worked out experimentally in the near future. Apart from these predictions, haplotype blocks, tag SNPs, copy number variations (CNVs), expressed sequence tags (ESTs) and genome survey sequence (GSS) information about the genes with the potential role for the development of congenital long QT syndrome (LQTS) are reported.

Distribution of total SNPs in the five selected genes

Detailed descriptions of polymorphisms and the respective mRNA sequences for the KCNQ1, KCNH2, SCN5A, KCNE1 and KCNE2 genes were obtained from the NCBI human genome protein sequence (Wheeler et al. 2006) and Swiss-Prot (Yip et al. 2004) databases. The information is given in Table 1. In total, 3,309 SNPs were found in all five genes. Among these, 97 (3%) were coding non-synonymous SNPs (nsSNPs), 113 were present in the 5′ and 3′ UTR regions, 71 were coding synonymous SNPs (sSNPs), and the remaining 3,028 were in the intron and non-coding exon regions. Only the coding nsSNPs and the 5′ and 3′ UTR SNPs were selected for the investigations. The distribution of total SNPs in the individual selected genes is shown in Fig. 1. The highest number of SNPs was present in the KCNQ1 gene, and the lowest number was present in the KCNE2 gene. Figure 2 shows the distribution of nsSNPs and UTR SNPs (3′ and 5′ together) across the five genes studied. It is interesting to note from this figure that, even though the total number of SNPs was much lower in the SCN5A gene than in KCNQ1 (Fig. 1), the number of nsSNPs was much higher in SCN5A than in KCNQ1. Also, the total number of UTR SNPs was found to be considerably higher for SCN5A and KCNE1 than for the other three genes. No regular correlation could be observed among the total SNPs, nsSNPs and UTR SNPs present in the five genes. This prompted us to investigate the deleterious nature of the individual nsSNPs and also to determine the role of SNPs in the 5′ and 3′ UTR regions of the selected five ion channel genes using computational methods.
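The category counts quoted above can be checked with a short tally. This is only an arithmetic sketch: the counts are those given in the text, while the dictionary keys and variable names are chosen here for illustration.

```python
# SNP counts per category as quoted in the text for all five genes combined.
snp_counts = {
    "coding non-synonymous": 97,
    "5' and 3' UTR": 113,
    "coding synonymous": 71,
    "intron and non-coding exon": 3028,
}

# The four categories should add up to the reported total of 3,309 SNPs.
total = sum(snp_counts.values())

# Share of each category, in percent, rounded to one decimal place.
percentages = {
    category: round(100 * count / total, 1)
    for category, count in snp_counts.items()
}

for category, share in percentages.items():
    print(f"{category}: {snp_counts[category]} SNPs ({share}%)")
```

The nsSNP share comes out at about 2.9%, which the text rounds to 3%.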

Table 1 Detailed information about the genes associated with congenital long QT syndrome
Fig. 1 Distribution of total number of SNPs in the genes associated with LQTS

Fig. 2 Total number of nsSNPs and UTR SNPs for genes associated with LQTS

Deleterious coding nonsynonymous SNPs found by the SIFT program

The SIFT program (Ng and Henikoff 2002) was used to detect the deleterious coding nonsynonymous SNPs. SIFT is a sequence homology-based tool that presumes that important amino acids will be conserved in the protein family. Hence, changes at well-conserved positions tend to be predicted as deleterious (Ng and Henikoff 2002). The query is submitted in the form of SNP IDs or protein sequences. SIFT takes a query sequence and uses multiple alignment information to predict tolerated and deleterious substitutions for every position of the query sequence. It is a multi-step procedure that, given a protein sequence, (a) searches for similar sequences, (b) chooses closely related sequences that may share similar functions, (c) obtains the multiple alignment of the chosen sequences and (d) calculates normalized probabilities for all possible substitutions at each position from the alignment. Substitutions at each position with normalized probabilities less than a chosen cutoff are predicted to be deleterious, and those greater than or equal to the cutoff are predicted to be tolerated (Ng and Henikoff 2001). The cutoff value in the SIFT program is a tolerance index of 0.05. The higher the tolerance index, the less functional impact a particular amino acid substitution is likely to have. Among the 97 coding non-synonymous SNPs in the selected five genes, 31 nsSNPs were predicted to be deleterious, with a tolerance index score of ≤0.05. The results are shown in Table 2. In the KCNQ1 gene, five nsSNPs showed functionally significant scores, and the SNP rs45478697 showed a highly deleterious tolerance index score of 0.00. Likewise, in the KCNH2 gene, seven nsSNPs showed functionally significant scores, and the SNPs rs45607339, rs11538710 and rs731506 showed a highly deleterious tolerance index score of 0.00. In the SCN5A gene, 11 nsSNPs had functionally significant scores, and the SNPs rs45600438, rs45589741, rs45546039 and rs6791924 showed a highly deleterious tolerance index score. In KCNE1, two nsSNPs (rs45457092 and rs28933384) and in KCNE2, four of the six nsSNPs (namely rs45600841, rs35759083, rs16991654 and rs2234916) showed a highly deleterious tolerance index score of 0.00. When tested with different datasets of human variants, the SIFT program showed a specificity of 88.3–90.6% and a sensitivity of 67.4–70.3% (Mathe et al. 2006).
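The thresholding step described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the tolerance-index scores have already been obtained from SIFT; the function name `classify_sift` and the example variants are hypothetical and are not values from Table 2. The handling of a score exactly at the cutoff follows the ≤0.05 convention used for the counts in this review, which is an assumption about the boundary case.

```python
SIFT_CUTOFF = 0.05  # tolerance-index cutoff quoted in the text


def classify_sift(tolerance_index: float, cutoff: float = SIFT_CUTOFF) -> str:
    """Label a substitution from its SIFT tolerance index.

    Scores at or below the cutoff are labelled deleterious (the convention
    used for the counts in this review); higher scores are labelled tolerated.
    """
    if not 0.0 <= tolerance_index <= 1.0:
        raise ValueError("tolerance index must lie in [0, 1]")
    return "deleterious" if tolerance_index <= cutoff else "tolerated"


# Hypothetical scores for illustration only (not values from Table 2).
example_scores = {"variant_A": 0.00, "variant_B": 0.05, "variant_C": 0.32}
labels = {name: classify_sift(score) for name, score in example_scores.items()}

for name, label in labels.items():
    print(name, example_scores[name], label)
```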

Table 2 Summary of nsSNPs that were predicted to have functional significance by SIFT algorithm

Damaged nsSNP found by the PolyPhen algorithm

Analyzing the damaged coding nonsynonymous SNPs at the structural level is considered to be very important for understanding the functional activity of the protein concerned. The PolyPhen algorithm (Ramensky et al. 2002) was used for this purpose. Input options for the PolyPhen server are a protein sequence, SWALL database ID or accession number, together with the sequence position and the two amino acid variants; here, each query was submitted as a protein sequence with the mutational position and two amino acid variants. Sequence-based characterization of the substitution site, profile analysis of homologous sequences and mapping of the substitution site to a known protein three-dimensional structure are the parameters taken into account by the PolyPhen program to calculate the score. It calculates position-specific independent count (PSIC) scores for each of the two variants and then computes the PSIC score difference between them. The higher the PSIC score difference, the higher the functional impact a particular amino acid substitution is likely to have. SNPs can be characterized by the type of nucleotide change as well as the putative functional effect. The 97 protein sequences of the nsSNPs investigated in this work were submitted as input to the PolyPhen program, and the results are shown in Table 3. A PSIC score difference was computed for each one, and a PSIC score difference of 1.5 and above is considered to be damaging. From the PSIC scores, it was found that 46 nsSNPs (with a PSIC score difference above 1.500) might significantly affect the protein structure (Table 3). Interestingly, there was a significant correlation between the SIFT and PolyPhen approaches. When these two approaches were used together, 5 nsSNPs in KCNQ1, 7 nsSNPs in KCNH2, 10 nsSNPs in SCN5A, 2 nsSNPs in KCNE1 and 3 nsSNPs in KCNE2 (27 in total) were predicted to be most deleterious. The nsSNP scores found to have functional significance by both SIFT and PolyPhen are shown in bold in Tables 2 and 3. The results obtained by these programs partially concur with the experimental works reported earlier. Notably, the SNPs rs179489 in KCNQ1 (Splawski et al. 1998), rs36210422 (Laitinen et al. 2000) and rs41313074 (Splawski et al. 2000) in KCNH2, rs28937316 (Splawski et al. 2000), rs6791924 (Yang et al. 2002) and rs28937318 (Smits et al. 2002) in SCN5A, rs28933384 (Schulze-Bahr et al. 1997) in KCNE1, and rs2234916, rs16991654 and rs35759083 (Millat et al. 2006) in KCNE2, all reported through experimental work, were also predicted to be deleterious mutations by the SIFT and PolyPhen programs. A few other mutations, which are marked as ‘Predicted in this work’ in Tables 2 and 3, have not been reported experimentally so far and remain to be verified experimentally.
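The combined use of the two tools amounts to keeping only those variants that pass both thresholds. The sketch below assumes that a SIFT tolerance index and a PolyPhen PSIC score difference are already available for each variant; the function name `consensus_deleterious` and the toy inputs are hypothetical and do not reproduce Tables 2 and 3.

```python
PSIC_DAMAGING = 1.5  # PSIC score-difference threshold quoted in the text
SIFT_CUTOFF = 0.05   # SIFT tolerance-index cutoff quoted in the text


def consensus_deleterious(predictions):
    """Return the sorted IDs of variants flagged by both criteria.

    `predictions` maps a variant ID to a (sift_score, psic_difference) pair.
    """
    return sorted(
        variant
        for variant, (sift_score, psic_diff) in predictions.items()
        if sift_score <= SIFT_CUTOFF and psic_diff >= PSIC_DAMAGING
    )


# Hypothetical inputs for illustration only (not values from Tables 2 and 3).
toy_predictions = {
    "variant_A": (0.00, 2.10),  # flagged by both tools
    "variant_B": (0.01, 0.90),  # flagged by SIFT only
    "variant_C": (0.40, 1.80),  # flagged by PolyPhen only
}
flagged = consensus_deleterious(toy_predictions)
print(flagged)
```

Requiring agreement between a conservation-based and a structure-aware predictor trades sensitivity for a shorter, higher-confidence candidate list.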

Table 3 Summary of nsSNPs that were predicted to have functional significance by PolyPhen algorithm

Functional SNPs in un-translated regions (UTR) found by the FastSNP program

Recent studies show that SNPs have functional effects on protein structure through a single change in the amino acid (Cargill et al. 1999; Sunyaev et al. 2000) and on transcriptional regulation (Prokunina and Alarcón-Riquelme 2004; Prokunina et al. 2002). The Web-based algorithm FastSNP (Yuan et al. 2006) was used for predicting the functional significance of SNPs in the 5′ and 3′ UTRs of the selected genes. The FastSNP program follows the decision tree principle with external Web service access to TFSearch, which predicts whether a noncoding SNP alters a transcription factor-binding site of a gene. The program assigns each SNP a risk ranking of 0, 1, 2, 3, 4 or 5, which signifies no, very low, low, medium, high and very high effect, respectively. Table 4 shows the list of SNPs in the 5′ untranslated region that are predicted to be functionally significant in the five selected genes. According to FastSNP, only five SNPs (rs41315349, rs41314819, rs4131547 and rs41315473 in KCNE1, and rs41260744 in KCNE2) have possible functional effects in the 5′ UTR regions of the KCNE1 and KCNE2 genes. Among these five SNPs, rs41315349 shows moderate to high levels of risk, and the remaining SNPs show very low to medium levels of risk. No SNPs were predicted to have a functional effect by the FastSNP program in the remaining three genes. Moreover, this algorithm did not find any functional significance for SNPs in the 3′ UTR, and hence the UTRscan algorithm was used to check the functional significance of SNPs in the 3′ and 5′ untranslated regions.
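The risk ranking can be read with a simple lookup. The sketch below only encodes the 0 to 5 scale described in the text; it does not call FastSNP, and the helper name `describe_risk_range` is hypothetical.

```python
# Risk levels as described in the text for the FastSNP ranking.
RISK_LEVELS = {
    0: "no effect",
    1: "very low",
    2: "low",
    3: "medium",
    4: "high",
    5: "very high",
}


def describe_risk_range(low: int, high: int) -> str:
    """Turn a (low, high) pair of risk ranks into a readable label."""
    if low not in RISK_LEVELS or high not in RISK_LEVELS:
        raise ValueError("risk ranks must be integers from 0 to 5")
    if low > high:
        raise ValueError("low rank must not exceed high rank")
    if low == high:
        return RISK_LEVELS[low]
    return f"{RISK_LEVELS[low]} to {RISK_LEVELS[high]}"


print(describe_risk_range(3, 4))  # a SNP ranked from medium to high risk
print(describe_risk_range(1, 3))  # a SNP ranked from very low to medium risk
```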

Table 4 List of SNPs (UTR mRNA) predicted to be functionally significant by FastSNP

Functional SNPs in UTR found by the UTRscan

The 5′ and 3′ UTRs are involved in various biological processes, such as post-transcriptional regulatory pathways, stability and translational efficiency (Sonenberg 1994; Nowak 1994). The UTRscan program (Pesole and Liuni 1999) allows one to search user-submitted sequences for any of the patterns collected in UTRsite, a collection of functional sequence patterns located in 5′ or 3′ UTR sequences. Briefly, the two or three sequences of each UTR SNP that differ in the nucleotide at the SNP position are analyzed by UTRscan, which looks for UTR functional elements by searching the submitted sequences for the patterns defined in the UTRsite and UTR databases. If the different sequences for a UTR SNP are found to have different functional patterns, that UTR SNP is predicted to have functional significance. The Internet resources for UTR analysis are UTRdb and UTRsite. UTRdb contains experimentally proven biological activity of functional patterns of UTR sequences from eukaryotic mRNAs (Pesole et al. 2002). UTRsite holds the data collected from UTRdb and is continuously enriched with new functional patterns. The different patterns include the 15-lipoxygenase differentiation control element (15-LOX-DICE) (Ostareck-Lederer et al. 1994, 1998; Ostareck et al. 1997), the internal ribosome entry site (IRES) (Le and Maizel 1997), the GY box (Lai et al. 2000), the alcohol dehydrogenase 3′UTR downregulation control element (ADH_DRE) (Parsch et al. 1999, 2000), the cytoplasmic polyadenylation element (CPE) (Vassalli and Stutz 1996; Verrotti et al. 1996) and the terminal oligopyrimidine tract (TOP) (Kato et al. 1994; Levy et al. 1991; Kaspar et al. 1992; Meyuhas et al. 1996). Polymorphisms in the 3′ UTR affect gene expression by affecting the ribosomal translation of mRNA or by influencing the RNA half-life (Van Deventer 2000). The UTRscan program results are shown in Table 5. 
There were 113 SNPs in the UTR regions of the five selected genes, of which 71 were in the 3′ UTR and 42 were in the 5′ UTR. UTRscan was applied to prioritize these 113 SNPs. For 44 of them, the allelic sequences showed different functional patterns, and these SNPs were therefore predicted to have functional significance. Among the 44 SNPs, 18 were present in the 5′ UTR, and 26 were present in the 3′ UTR. Of the 44 (Table 5), 31 were related to a functional pattern change of 15-LOX-DICE, 6 to a change of IRES and 2 to a change of the GY box, and one each was related to a change of ADH_DRE, CPE, TOP, K-Box and Brd-Box.
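The decision rule used here (a UTR SNP is flagged when its allelic sequences carry different sets of functional patterns) can be sketched as follows. This is not the UTRscan implementation: the two motifs in `TOY_PATTERNS` are invented placeholders, not UTRsite pattern definitions, and the function names are hypothetical.

```python
import re

# Toy motifs for illustration only; these are NOT the UTRsite pattern definitions.
TOY_PATTERNS = {
    "toy_motif_1": re.compile(r"GTCTTCC"),
    "toy_motif_2": re.compile(r"TTTTAT"),
}


def motifs_found(sequence):
    """Return the names of all toy motifs that occur in the sequence."""
    return frozenset(
        name for name, pattern in TOY_PATTERNS.items() if pattern.search(sequence)
    )


def utr_snp_changes_pattern(left_flank, alleles, right_flank):
    """Return True if the allelic sequences do not all carry the same motifs."""
    motif_sets = {
        motifs_found(left_flank + allele + right_flank) for allele in alleles
    }
    return len(motif_sets) > 1


# The C allele completes toy_motif_1, the G allele destroys it.
changed = utr_snp_changes_pattern("AAGTCTT", ["C", "G"], "CAA")
# Neither allele creates or removes a motif in this flanking context.
unchanged = utr_snp_changes_pattern("AAAA", ["C", "G"], "AAAA")
print(changed, unchanged)
```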

Table 5 Summary of UTR SNPs that were predicted to have functional significance by the UTRscan algorithm

Selection of haplotype tag SNPs

Haplotypes are combinations of single nucleotide polymorphism (SNP) alleles that are inherited together on a chromosome, and they have important implications for the mapping of disease genes and human traits. Often only a small subset of the SNPs is sufficient to capture the full haplotype information. Such subsets of markers are called haplotype tagging SNPs (htSNPs). The HapMap website at http://www.hapmap.org is the primary portal to genotype data produced as part of the International HapMap Project (Gibbs et al. 2003). The Haploview program (Barrett et al. 2005) was used for analyzing the number of haplotype blocks and selecting the haplotype tag SNPs. Haploview is a tool for the selection and evaluation of tag SNPs from genotype data, such as those from the International HapMap Project. It combines the simplicity of pairwise tagging methods with the efficiency benefits of multimarker haplotype approaches. The genotype data of all five genes, namely KCNQ1, KCNH2, SCN5A, KCNE1 and KCNE2, are uploaded in raw HapMap format to the Haploview program, which then calculates the linkage disequilibrium patterns and the number of haplotype blocks in each gene. The two most common measures of LD are the absolute value of D′ and r². The absolute value of D′ is determined by dividing D by its maximum possible value for the given allele frequencies at the two loci. The case of D′ = 1 is known as complete LD, and values of D′ < 1 indicate that the complete ancestral LD has been disrupted. The magnitude of values of D′ < 1 has no clear interpretation (Lewontin 1964; Hill and Robertson 1968; Nachman 2002). Estimates of D′ are strongly inflated in small samples. Therefore, statistically significant values of D′ near one provide a useful indication of minimal historical recombination, but intermediate values should not be used to compare the strength of LD between studies or to measure the extent of LD. The measure r² is in some ways complementary to D′. It is equal to D² divided by the product of the four allele frequencies at the two loci. In this work, r² > 0.9 was taken to indicate complete linkage disequilibrium, and values below 0.9 were not considered significant. Hill and Robertson (1968) deduced that $E[r^2] = 1/(1 + 4N_e c)$, where $c$ is the recombination rate between the two markers and $N_e$ is the effective population size. This equation illustrates two important properties of LD. First, expected levels of LD are a function of recombination. The more recombination between two sites, the more they are shuffled with respect to one another, decreasing LD. Second, LD is a function of $N_e$, emphasizing that LD is a property of populations. To arrive at this equation, Hill and Robertson assumed that the population was an “ideal,” large, random-mating population without natural selection and mutation. Another approach for quantifying LD is through the population recombination parameter $\rho = 4N_e c$. This approach avoids reliance on pairwise measures of LD, which differ from marker to marker, and facilitates comparisons between regions. It shows considerable promise for quantifying the strength of LD in a region. The number of haplotype blocks, the haplotype tag SNPs and their D′ and r² values in the genes associated with congenital long QT syndrome are shown in Table 6. Among the five selected genes, KCNQ1 has the highest number of haplotype blocks (26), as well as the maximum number of tag SNPs (6) with complete D′ and r² values. Among the six tag SNPs, rs10798 and rs8234 were present in the 3′ UTR and were also predicted to be deleterious SNPs by the SIFT and PolyPhen programs (Tables 2, 3). The remaining tag SNPs in KCNQ1 are present in the intron regions. The gene SCN5A has ten haplotype blocks and four tag SNPs, of which two, namely rs6795580 and rs6599229, have values indicating disrupted ancestral LD, and the other two, rs7427106 and rs6768664, have complete D′ and r² values. All four tag SNPs are present in the intron regions. Since both KCNE1 and KCNE2 genes are located on chromosome 21, similar haplotype block and tag SNPs were obtained. Out of four tag SNPs, two are present in the intron region, and the other two are present in the 3′ UTR. The SNPs rs2834485 and rs11702354 show complete LD values, and rs9305548 and rs9984281 have values indicating disrupted ancestral LD. Four haplotype blocks and two tag SNPs, namely rs3807375 and rs2072413, are present in the KCNH2 gene.
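The LD measures discussed above can be computed directly from haplotype and allele frequencies. The following is a minimal sketch of the standard definitions for two biallelic loci, together with the Hill and Robertson expectation; it is not the Haploview implementation, and the function names and input frequencies are chosen here for illustration.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    # The maximum possible |D| depends on the sign of D and the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    # r^2 divides D^2 by the product of all four allele frequencies.
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


def expected_r2(n_e, c):
    """Hill and Robertson expectation E[r^2] = 1 / (1 + 4 * N_e * c)."""
    return 1.0 / (1.0 + 4.0 * n_e * c)


# Illustrative frequencies: allele A occurs only on haplotypes that carry B.
d, d_prime, r2 = ld_stats(p_ab=0.3, p_a=0.3, p_b=0.5)
print(d_prime, r2)
print(expected_r2(10_000, 1e-4))
```

With these illustrative frequencies, |D′| equals 1 while r² is only about 0.43, which shows why the two measures are complementary: D′ reaches 1 whenever at least one of the four possible haplotypes is absent, whereas r² reaches 1 only when the two loci also have matching allele frequencies.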

Table 6 Number of haplotype blocks and haplotype tag SNPs in the genes associated with congenital long QT syndrome (LQTS)

Copy number variation assessment

Copy number variation (CNV) assessment should now become standard in the design of all studies of the genetic basis of phenotypic variation, including disease susceptibility. Genetic variation in the human genome takes many forms, ranging from large, microscopically visible chromosome anomalies to single nucleotide changes. Recently, multiple studies have discovered an abundance of submicroscopic copy number variations of DNA segments ranging from kilobases (kb) to megabases (Mb) in size (Iafrate et al. 2004; Sebat et al. 2004; Sharp et al. 2005; Tuzun et al. 2005). Deletions, insertions, duplications and complex multi-site variants (Fredman et al. 2004), collectively termed copy number variations (CNVs) or copy number polymorphisms (CNPs), are found in all humans (Feuk et al. 2006) and in other mammals (Freeman et al. 2006) as well. The functional significance of CNV of DNA sequences has yet to be fully ascertained. A CNV can be simple in structure, such as a tandem duplication, or may involve complex gains or losses of homologous sequences at multiple sites in the genome. Most CNVs are benign variants that will not directly cause disease. However, there are several instances where CNVs that affect critical developmental genes do cause disease. Since the discovery of CNVs is so new, bioethics studies are just now underway. Compared to other genetic variants, CNVs are larger in size and can often involve complex repetitive DNA sequences. They can also encompass entire genes, many of which have a specific function ascribed to them. For these reasons, CNV data could potentially be more prone to misinterpretation. Some CNVs could be employed to add discrimination power in forensics, but typing them is usually less efficient than typing other kinds of genetic markers. As with all types of genetic variation, CNVs can vary in frequency and occurrence between populations. 
As a result of recent common origin, the vast majority of copy number variations (around 89%) are shared among the diverse human populations studied. CNVs can be retrieved from the Database of Genomic Variants. Three copy number variants, namely Variation_3710, Variation_22718 and Variation_38031, are present at the cytogenetic band 7q36.1 in the KCNH2 gene. The genes KCNE1 and KCNE2 are both located on chromosome 21 and have two variants, namely Variation_34534 and Variation_26957, at the cytogenetic bands between 21q22.11 and 21q22.12. The variants Variation_29897 and Variation_36146 are present in the KCNQ1 and SCN5A genes at the cytogenetic bands 11p15.5 and 3p22.2, respectively. The details of the copy number variations are given in Table 7.

Table 7 Copy number variations (CNVs) in the genes associated with congenital long QT syndrome (LQTS)

Expressed sequence tag and genome survey sequence database screening

The human expressed sequence tag (EST) database provides a wealth of resources that can be used to rapidly screen for potential polymorphisms in proteins of physiological interest. It consists of >3,700,000 entries of partial cDNA sequences. These sequences have been generated from many different tissues and are derived from a range of individuals. ESTs can reflect a part or all of the transcribed sequence of a gene, which includes the coding sequences as well as the 5′ and 3′ untranslated regions (UTRs). Currently, the EST database is accessible online from the website of the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/dbEST/). Database screening can be performed using gapped BLAST programs (Ulrich et al. 2000), which are obtainable from the homepage of the NCBI (Altschul et al. 1997). The genome survey sequence (GSS) division of GenBank is similar to the EST division, with the exception that most of the sequences are genomic in origin, rather than cDNA (mRNA). It should be noted that two classes (exon trapped products and gene trapped products) may be derived via a cDNA intermediate. Although dbGSS sequences are incorporated into the GSS division of GenBank, annotation in dbGSS is more comprehensive and includes detailed information about the contributors, experimental conditions and genetic map locations. The EST and GSS data for the selected five genes were retrieved from the dbEST and dbGSS databases, respectively, and the information is given in Table 8.

Table 8 Database information on expressed sequence tag (EST) and genome survey sequences (GSS) of genes associated with congenital long QT syndrome (LQTS)

Summary and conclusions

The congenital long QT syndrome is a potentially life-threatening condition caused by mutations in genes encoding cardiac ion channels. The genes KCNQ1, KCNH2, SCN5A, KCNE1 and KCNE2, which encode cardiac ion channels with a potential role in causing LQTS, were investigated by evaluating the influence of functional SNPs through computational methods. Although the literature survey showed that there is a wide range of material on these genes in relation to LQTS, no computational studies have so far been undertaken to investigate the nsSNP mutations. Of the total of 3,309 SNPs in all five genes, 27 nsSNPs were found to have functional significance by the SIFT and PolyPhen programs, and 44 UTR SNPs by the FastSNP and UTRscan programs. Of the 27 functionally significant nsSNPs in the coding region, 5 belong to the KCNQ1 gene, 7 to KCNH2, 10 to SCN5A, 2 to KCNE1 and 3 to KCNE2. Of the 44 functionally significant SNPs in the 5′ and 3′ UTR regions, 14 belong to the KCNQ1 gene, 7 to KCNH2, 11 to SCN5A, 11 to KCNE1 and 1 to KCNE2. Among the five selected genes, KCNQ1 has the highest number of haplotype blocks (26) and 6 tag SNPs with a complete LD value of 1.0. Among the six tag SNPs, rs10798 and rs8234 were also predicted to be deleterious SNPs by the SIFT and PolyPhen programs. The gene SCN5A has ten haplotype blocks and four tag SNPs. Both KCNE1 and KCNE2 genes have similar haplotype block and tag SNPs. Two tag SNPs and four haplotype blocks were obtained for the KCNH2 gene. The results on these five ion channel genes provide insight into the disease-causing functional and haplotype tag SNPs related to LQTS. The reported data indicate that bioinformatic tools are indeed useful in predicting the functional impact of SNPs. 
Although these algorithms have been developed based on empirical data, correlations between predictive scores and findings from human studies have not been explored, with the exception of SIFT. The results of this study have therefore provided novel evidence of the correlation using human data, in turn facilitating genotyping efforts in future molecular epidemiological studies and providing targets for phenotypic analysis of genetic variants. These results can also be used to refine the bioinformatic algorithms. Also, these findings warrant a more comprehensive approach and more available bioinformatic tools in future analyses.

Genetic polymorphisms in the human population have been studied in order to gain insight into their influence on the activity of specific genes involved in disease susceptibility. Finding previously unknown polymorphisms has often relied on the detection of a related phenotype. This is a time-consuming task, usually requiring months or years at the bench to identify a novel polymorphism. Moreover, many polymorphisms may exist in the human genome that have not been identified and characterized because of problems of methodology. The use of computational algorithms and human genomic variation databases to find novel genetic polymorphisms provides an alternative opportunity to investigate the consequences of polymorphism for gene and protein activity. The genetic screening of symptomatic patients or asymptomatic family members may identify patients at risk for life-threatening congenital long QT syndromes. More specifically, mutation carriers without symptoms or ECG characteristics of the congenital long QT syndrome are at great risk. Family members considered normal on clinical and ECG grounds could be silent gene carriers displaying a very mild phenotype. They would be unexpectedly at risk of having affected offspring and also of developing arrhythmias if exposed to either cardiac or noncardiac drugs that block potassium channels. Molecular screening is therefore recommended in all family members of positively genotyped patients, and this report may help in such molecular screening work. Also, recent studies suggest that genotype-specific treatment of the congenital long QT syndrome will be feasible in the near future. These results, based on the application of computational tools such as SIFT, PolyPhen, FastSNP, UTRscan and Haploview, might provide a useful approach to selecting target SNPs for genotype-specific treatment of congenital long QT syndrome. 
Also, the applications of these computational algorithms in association studies will greatly strengthen the understanding of inheritance of complex human phenotypes. Therefore, this kind of analysis will provide useful information in selecting SNPs that are likely to have potential functional impact and ultimately contribute to an individual’s susceptibility to LQTS by the KCNQ1, KCNH2, SCN5A, KCNE1 and KCNE2 genes.

Acknowledgments

The authors thank the management of Vellore Institute of Technology, Vellore, for providing the facilities to carry out this work. The authors also heartily thank the Editor-in-chief and the reviewers for their valuable suggestions in the improvement of this manuscript.

Author information

Corresponding author

Correspondence to Rao Sethumadhavan.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Sudandiradoss, C., Sethumadhavan, R. In silico investigations on functional and haplotype tag SNPs associated with congenital long QT syndromes (LQTSs). HUGO J 2, 55–67 (2008). https://doi.org/10.1007/s11568-009-9027-3

  • DOI: https://doi.org/10.1007/s11568-009-9027-3