SNPs distribution in genes
Last Update: 21 March 2002

  1. Published Results
  2. Our Results
  3. SNPs and diseases



Published Results

In February 2001, the International SNP Map Working Group reported (Nature 2001 Feb 15;409(6822):928-33) that they had found 1.42 million SNPs distributed throughout the human genome. Taking the genome as 2.7 gigabases, this gives an average of one SNP every 1.9 kilobases.
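The average quoted above is a simple division. A minimal sketch of the arithmetic, using only the two figures given in the text (the function and variable names are illustrative, not part of the project's script):

```python
# Figures quoted from the text above.
GENOME_SIZE_BP = 2.7e9   # 2.7 gigabases
TOTAL_SNPS = 1.42e6      # 1.42 million SNPs


def average_spacing_kb(genome_size_bp: float, snp_count: float) -> float:
    """Mean distance between consecutive SNPs, in kilobases."""
    return genome_size_bp / snp_count / 1000.0


spacing_kb = average_spacing_kb(GENOME_SIZE_BP, TOTAL_SNPS)
print(f"One SNP every {spacing_kb:.1f} kb on average")
```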

They also calculated that SNPs lie close to one another, at high density: 90% of contiguous 20 Kb sequences contain one or more SNPs, as do 63% of 5 Kb sequences and 28% of 1 Kb sequences. Only 4% of the genome sequence falls in gaps between SNPs of more than 80 Kb (and some of these gaps are still being mapped).
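To put these percentages in context, they can be compared with what a perfectly uniform scatter of SNPs would give. The sketch below is our own illustration, not a calculation from the published paper: it assumes SNPs fall independently and uniformly along the genome (a Poisson model) with the 1.9 Kb mean spacing quoted above, so that a window of length L contains at least one SNP with probability 1 - exp(-L / 1.9).

```python
import math

MEAN_SPACING_KB = 1.9  # average distance between SNPs, from the text

# Observed fraction of windows with at least one SNP, from the text.
OBSERVED = {20: 0.90, 5: 0.63, 1: 0.28}


def expected_fraction_with_snp(window_kb: float, mean_spacing_kb: float) -> float:
    """P(at least one SNP in a window) under a uniform (Poisson) model."""
    expected_snps = window_kb / mean_spacing_kb
    return 1.0 - math.exp(-expected_snps)


for window_kb, observed in OBSERVED.items():
    expected = expected_fraction_with_snp(window_kb, MEAN_SPACING_KB)
    print(f"{window_kb:>2} kb: uniform model {expected:.2f}, observed {observed:.2f}")
```

Under this assumption the observed fraction is lower than the uniform-model value at every window size, which is the pattern one would expect if SNPs were clustered in some regions and sparse in others.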

SNPs are not, however, homogeneously distributed. To evaluate the density of SNPs in regions within and surrounding genes, they used 7000 non-redundant mRNAs and estimated 2 exonic SNPs per gene (coding and untranslated regions). Considering the approximately 30000 genes of the human genome, this makes an average of one SNP per 1.08 Kb: a higher density than in the genome as a whole.





Our Results

The following links lead to the results for each individual chromosome. Both the full results and a summary are available for each chromosome.

Chromosome 1

Results     Summary

Chromosome 2

Results     Summary

Chromosome 3

Results     Summary

Chromosome 4

Results     Summary

Chromosome 5

Results     Summary

Chromosome 6

Results     Summary

Chromosome 7

Results     Summary

Chromosome 8

Results     Summary

Chromosome 9

Results     Summary

Chromosome 10

Results     Summary

Chromosome 11

Results     Summary

Chromosome 12

Results     Summary

Chromosome 13

Results     Summary

Chromosome 14

Results     Summary

Chromosome 15

Results     Summary

Chromosome 16

Results     Summary

Chromosome 17

Results     Summary

Chromosome 18

Results     Summary

Chromosome 19

Results     Summary

Chromosome 20

Results     Summary

Chromosome 21

Results     Summary

Chromosome 22

Results     Summary

X-Chromosome

Results     Summary

Y-Chromosome

Results     Summary



SNPs and Diseases

As we have said before, an SNP could cause defective splicing if it is located in a donor or acceptor site. However, the low prevalence of any particular SNP makes it difficult to determine which diseases it may cause.
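The splice-site condition described above can be sketched as a simple position check. This is a hedged illustration, not the project's actual script: the function name, the exon coordinates, and the 2-base definition of a splice site (the first two and last two bases of each intron) are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

Exon = Tuple[int, int]  # (start, end), 1-based inclusive, forward strand


def splice_site_hit(snp_pos: int, exons: List[Exon], width: int = 2) -> Optional[str]:
    """Return 'donor' or 'acceptor' if snp_pos lies in a splice site, else None.

    Assumption: the donor site is the first `width` intronic bases after an
    exon, and the acceptor site is the last `width` intronic bases before
    the next exon.
    """
    exons = sorted(exons)
    # Each pair of consecutive exons delimits one intron.
    for (_, end), (next_start, _) in zip(exons, exons[1:]):
        intron_start, intron_end = end + 1, next_start - 1
        if intron_start <= snp_pos < intron_start + width:
            return "donor"
        if intron_end - width < snp_pos <= intron_end:
            return "acceptor"
    return None


# Hypothetical three-exon gene; coordinates are made up for illustration.
exons = [(100, 200), (301, 400), (551, 650)]
for pos in (201, 300, 250, 549):
    print(pos, splice_site_hit(pos, exons))
```

A wider `width` would also catch SNPs in the less conserved bases flanking the canonical dinucleotides, at the cost of more hits with no effect on splicing.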

While searching for more information, we found it very difficult to correlate SNPs with disease, because databases either do not display this kind of relationship or are not updated with the newest data.

On the one hand, OMIM contains examples of single-nucleotide variations that cause disease, but they are not correlated with the SNPs at NCBI (the ones we have worked with). Moreover, some of them are correlated with data from the Celera sequencing center, but the results cannot be overlapped: a global SNP data set is needed.

If we look up, for example, the gene RUNX1 in the OMIM allelic variants, we find a disease caused by an SNP in the splice site of intron 3, but the SNP causing it cannot be found anywhere except in The Human Gene Mutation Database, in association with Celera and with an identifier that cannot be used at NCBI.

On the other hand, we have searched for SNPs at the splice sites of CFTR (cystic fibrosis transmembrane conductance regulator), located on chromosome 7 and responsible for cystic fibrosis. Correlating all the known disease-causing mutations with the NCBI SNPs, we have found many SNPs in this gene, as well as disease-causing splice-site defects due to a single nucleotide change, but it is difficult to compare the two web sites because the sequences are not the same. For example, one of the SNPs we have found on chromosome 7 (rs1800094 at NCBI), which corresponds to a G to A change, may in our view be located in intron 10 of the CFTR gene and involve a splicing defect. However, it is difficult to generalize and to find global or prevalent examples.

It can be assumed that the SNPs in our source database represent the pool of the most prevalent SNPs in the population. Clearly, these SNPs could hardly be responsible for disease: being so prevalent, they would cause a very large number of disease cases. Nevertheless, we found a great number of SNPs in splice sites of the cystic fibrosis gene, and many mutations are known to cause CFTR gene dysfunction. We suspect that SNPs in CFTR splice sites, even if they are not determinant for disease, may contribute to the susceptibility to develop this illness.