Human pre-mRNA splicing signals.

Journal of theoretical biology

PubMedID: 1798333

Penotti FE. Human pre-mRNA splicing signals. J Theor Biol. 1991;150(3):385-420.
A sample of 764 pairs of human pre-mRNA exon-intron and intron-exon boundaries, extracted from the European Molecular Biology Laboratory data bank, is analyzed to provide a species-optimized characterization of donor and acceptor sites, evaluate the information content of the two signals (found to be about 8 and 9 bits, respectively), and check the independent-base approximation (which holds well) and the "GT-AG" rule (to which a few well-documented exceptions are found). No correlation is detected between the strength ("discrimination energy") of an actual donor-site signal and that of its corresponding acceptor-site counterpart, nor between that of either signal, or the cumulative strength of both, and the length of the intervening intron. The discrimination-energy distributions of the two signals are determined. Because of the large sample size and its single-species origin, the two distributions can be presumed to be representative of their underlying genomic counterparts. The size distribution of the introns shows a lower cut-off of 70 nucleotides (in essential agreement with published experimental results), and apparently no periodicities. A smaller sample of mammalian branch sites, taken from the literature, is similarly analyzed to attempt a characterization of this rather elusive signal, and provides some indication that at least part of the "long pyrimidine stretch", usually considered an integral constituent of the 3' splice signal, may be just as strongly associated with the branch site, in agreement with recent experimental observations. The usefulness of these characterizations for splice-junction searches is assessed on a test sequence.
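To make the notion of "information content in bits" concrete, the following is a minimal sketch of how such a figure can be computed from aligned sites under the independent-base approximation. It is not the paper's procedure: the function names, the uniform 25% background, the pseudocount, and the four toy donor-like sequences are all assumptions introduced for illustration. The `log_ratio_score` function is a simple stand-in for a signal-strength score (zero for the consensus, larger for weaker sites); the abstract does not define "discrimination energy", so this should not be read as its definition.

```python
import math
from collections import Counter

BASES = "ACGT"

def position_frequencies(sites, pseudocount=0.0):
    """Per-position base frequencies for equal-length aligned sites."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    freqs = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        freqs.append({b: (counts.get(b, 0) + pseudocount) / total for b in BASES})
    return freqs

def information_content(freqs, background=None):
    """Sum over positions of relative entropy (in bits) against a background.

    Summing per-position terms treats positions as independent.
    The uniform background is an assumption, not taken from the source.
    """
    if background is None:
        background = {b: 0.25 for b in BASES}
    bits = 0.0
    for column in freqs:
        for base, f in column.items():
            if f > 0:
                bits += f * math.log2(f / background[base])
    return bits

def log_ratio_score(site, freqs):
    """Hypothetical strength score: sum of ln(f_consensus / f_observed).

    Requires nonzero frequencies, so build freqs with pseudocount > 0.
    """
    return sum(math.log(max(col.values()) / col[base])
               for base, col in zip(site, freqs))

# Toy, invented donor-like sites (illustration only, not data from the paper).
toy_sites = ["GTAAGT", "GTGAGT", "GTAAGT", "GTAAGA"]

toy_bits = information_content(position_frequencies(toy_sites))
smoothed = position_frequencies(toy_sites, pseudocount=0.5)
consensus_score = log_ratio_score("GTAAGT", smoothed)
variant_score = log_ratio_score("GTGAGA", smoothed)
```

With a real alignment, `toy_bits` would be the quantity comparable to the roughly 8 and 9 bits quoted above; the toy value here has no biological meaning. The pseudocount in the scoring step only keeps unseen bases from producing a division by zero.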