<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1745-6150-3-30</ui>
   <ji>1745-6150</ji>
   <fm>
      <dochead>Discovery notes</dochead>
      <bibl>
         <title>
            <p>Accumulation of GC donor splice signals in mammals</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Churbanov</snm>
               <fnm>Alexander</fnm>
               <insr iid="I1"/>
               <email>atchourbanov@lumc.edu</email>
            </au>
            <au id="A2">
               <snm>Winters-Hilt</snm>
               <fnm>Stephen</fnm>
               <insr iid="I2"/>
               <email>winters@cs.uno.edu</email>
            </au>
            <au id="A3" ca="yes">
               <snm>Koonin</snm>
               <mi>V</mi>
               <fnm>Eugene</fnm>
               <insr iid="I3"/>
               <email>koonin@ncbi.nlm.nih.gov</email>
            </au>
            <au id="A4">
               <snm>Rogozin</snm>
               <mi>B</mi>
               <fnm>Igor</fnm>
               <insr iid="I3"/>
               <email>rogozin@ncbi.nlm.nih.gov</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Loyola University Medical Center, 2160 S. First Ave., Maywood, IL, 60153, USA</p>
            </ins>
            <ins id="I2">
               <p>Department of Computer Science, University of New Orleans, New Orleans, LA, 70148, USA</p>
            </ins>
            <ins id="I3">
               <p>National Center for Biotechnology Information NLM, National Institutes of Health, Bethesda, MD, 20894, USA</p>
            </ins>
         </insg>
         <source>Biology Direct</source>
         <issn>1745-6150</issn>
         <pubdate>2008</pubdate>
         <volume>3</volume>
         <issue>1</issue>
         <fpage>30</fpage>
         <url>http://www.biology-direct.com/content/3/1/30</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">18613975</pubid>
               <pubid idtype="doi">10.1186/1745-6150-3-30</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>07</day>
               <month>7</month>
               <year>2008</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>09</day>
               <month>7</month>
               <year>2008</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>09</day>
               <month>7</month>
               <year>2008</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2008</year>
         <collab>Churbanov et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p/>
               </st>
               <p>The GT dinucleotide in the first two intron positions is the most conserved element of the U2 donor splice signals. However, in a small fraction of donor sites, GT is replaced by GC. A substantial enrichment of GC in donor sites of alternatively spliced genes has been observed previously in human, nematode and <it>Arabidopsis</it>, suggesting that GC signals are important for the regulation of alternative splicing. We used parsimony analysis to reconstruct the evolution of donor splice sites and inferred 298 GT > GC conversion events compared to 40 GC > GT conversion events in primate and rodent genomes. Thus, there was a substantial accumulation of GC donor splice sites during the evolution of mammals. Accumulation of GC sites might have been driven by selection for alternative splicing.</p>
            </sec>
            <sec>
               <st>
                  <p>Reviewers</p>
               </st>
               <p>This article was reviewed by Jerzy Jurka and Anton Nekrutenko. For the full reviews, please go to the Reviewers' Reports section.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="endnote"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Findings</p>
         </st>
         <p>In vertebrates, most of the protein-coding genes are interrupted by multiple introns that are removed at the donor and acceptor splice sites so that the adjacent exons are joined. This process is mediated by an elaborate molecular machine, the spliceosome, which consists of 5 snRNPs (small nuclear ribonucleoprotein particles) along with numerous less stably associated proteins and is conserved throughout the eukaryotic world <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>. The U2 spliceosome (the major eukaryotic spliceosome) interacts with specific parts of the intron and the flanking exons to ensure accurate and efficient splicing <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. The nucleotides at the intron termini and the adjacent nucleotides in the exons are involved in these interactions and comprise the splicing signal. The (A/C)AG|<ul>GT</ul>(A/G)AGT consensus sequence (the exon|intron boundary is shown by the vertical bar and the first two nucleotides of the intron are underlined) at the donor splice signal is complementary to the 5' end of U1 snRNA, and this interaction is believed to be the major requirement for splicing <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>.</p>
         <p>The GT dinucleotide in the first two intron positions is the most conserved element of the U2 donor splice signal. However, in a small fraction of donor sites (&lt;1%), GT is replaced by GC; in these cases, the rest of the nucleotides in the donor signal adhere more closely to the consensus sequence, apparently compensating for the T to C substitution, which is unfavorable for splicing <abbrgrp><abbr bid="B8">8</abbr><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr></abbrgrp>. This rare class of donor splice signals has been implicated in alternative splicing <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>. For example, the conserved C at the +2 position of the 10<sup>th</sup> intron of the let-2 gene, which encodes one of the collagen isoforms, is essential for developmentally regulated alternative splicing in the nematode <it>C. elegans</it>. Replacement of the GC donor signal with a moderate or strong GT signal abolishes splicing regulation and leads to excessive usage of exon 10 of let-2 in embryos <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. More generally, a substantial enrichment of GC donor signals in alternatively spliced genes has been observed in human, <it>C. elegans</it> and <it>Arabidopsis</it> <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>.</p>
         <p>Pairwise comparisons of GC splicing signals in the nematodes <it>Caenorhabditis elegans</it> and <it>C. briggsae</it> suggested that GC donor signals are not evolutionarily conserved in nematodes: among the 26 <it>C. elegans</it> GC-AG introns, only 5 had a GC-AG counterpart in <it>C. briggsae</it> <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>. Frequent switching between GT-AG and GC-AG introns has been reported for 5 vertebrate genomes <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>. We were interested in exploring the genome-wide evolutionary dynamics of the donor splice sites and, in particular, sought to determine whether there might be a trend toward depletion or accumulation of GC.</p>
         <p>Genomic alignments of 8 vertebrates (chicken, opossum, dog, cow, rhesus macaque, human, mouse and rat) were extracted from the UCSC genome browser (Additional file <supplr sid="S1">1</supplr>) <abbrgrp><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr></abbrgrp> and used to map cases of GC > GT and GT > GC conversion to the branches of the mammalian phylogeny (Figure <figr fid="F1">1</figr>). Terminal branches of the tree (individual genomes) are more vulnerable to the effects of sequencing errors and/or population polymorphism than internal branches <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Therefore, only those sites were analyzed in which the GC or GT donor signal was shared by at least two sister taxa (e.g., we assigned the GC signature to the rodent clade when GC was found in both mouse and rat sequences). It was further required that either the GT or the GC signal be conserved in all outgroup species for which an alignment was available at the given site (the presence of the dog and cow sequences was unconditionally required). Altogether, there were 122,621 and 253 invariant GT and GC donor signals, respectively, and 656 variable sites, of which 338 mapped to internal branches and were employed for the analysis of evolutionary dynamics.</p>
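         <p>The site-selection rule described above can be illustrated with a minimal sketch. This is not the procedure used in the study (see Additional file 1 for that); the function name, the species labels and the dictionary layout are assumptions made purely for illustration. When both the primate and the rodent clades carry the derived signal, the sketch reports a single event on their combined branch.</p>
         <p><![CDATA[
```python
# Illustrative sketch only: names and data layout are hypothetical,
# not taken from the authors' pipeline.
SISTER_PAIRS = {"primates": ("human", "macaque"), "rodents": ("mouse", "rat")}
REQUIRED_OUTGROUPS = ("dog", "cow")
OPTIONAL_OUTGROUPS = ("opossum", "chicken")
OTHER = {"GT": "GC", "GC": "GT"}


def classify_site(signals):
    """Return (branch, event) for an internal-branch conversion, or None.

    `signals` maps a species name to its donor dinucleotide ("GT" or "GC");
    species with no alignment at the site are simply absent from the dict.
    """
    # Only GT and GC donor signals are considered.
    if any(state not in OTHER for state in signals.values()):
        return None
    # Dog and cow sequences are unconditionally required.
    if any(species not in signals for species in REQUIRED_OUTGROUPS):
        return None
    # The signal must be conserved in every available outgroup species.
    outgroup_states = {
        signals[species]
        for species in REQUIRED_OUTGROUPS + OPTIONAL_OUTGROUPS
        if species in signals
    }
    if len(outgroup_states) != 1:
        return None
    ancestral = outgroup_states.pop()
    derived = OTHER[ancestral]
    # A clade counts only if both sister taxa share the derived signal.
    clades = [
        name
        for name, (first, second) in SISTER_PAIRS.items()
        if signals.get(first) == derived and signals.get(second) == derived
    ]
    if not clades:
        return None
    return "+".join(clades), f"{ancestral}>{derived}"


site = {"human": "GT", "macaque": "GT", "mouse": "GC", "rat": "GC",
        "dog": "GT", "cow": "GT", "opossum": "GT"}
print(classify_site(site))
```
]]></p>
         <p>For the example site, GC is shared by mouse and rat while all available outgroups carry GT, so the sketch assigns a GT>GC event to the rodent branch. A site lacking the dog or cow sequence is skipped.</p>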
         <suppl id="S1">
            <title>
               <p>Additional file 1</p>
            </title>
            <text>
               <p>Materials and Methods.</p>
            </text>
            <file name="1745-6150-3-30-S1.doc">
               <p>Click here for file</p>
            </file>
         </suppl>
         <fig id="F1">
            <title>
               <p>Figure 1</p>
            </title>
            <caption>
               <p>Parsimony reconstruction of GT > GC (red) and GC > GT (green) donor signal conversion events mapped on the phylogenetic tree of 8 vertebrate species</p>
            </caption>
            <text>
               <p><b>Parsimony reconstruction of GT > GC (red) and GC > GT (green) donor signal conversion events mapped on the phylogenetic tree of 8 vertebrate species</b>. The tree topology that includes the primate-rodent clade was from <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>.</p>
            </text>
            <graphic file="1745-6150-3-30-1"/>
         </fig>
         <p>The GC>GT and GT>GC conversion events were reconstructed using maximum parsimony (Figure <figr fid="F1">1</figr>). Unexpectedly, we observed a pronounced excess of GT>GC conversions over GC>GT conversions, which is indicative of accumulation of GC donor splice sites in both primate and rodent genomes (Figure <figr fid="F1">1</figr>). The trend is stronger in the rodent lineage than in the primate lineage (Figure <figr fid="F1">1</figr>), an observation that is consistent with the overall fast genome evolution in rodents <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. The apparent accumulation of GC donor signals was further supported by the analysis of the terminal branches of the tree, although the excess of GC>GT conversions in macaque compared to human (Figure <figr fid="F1">1</figr>) could be caused by sequencing errors and/or population polymorphism. The observed excess of GT>GC conversions was robust with respect to the composition of the outgroup species set (Additional file <supplr sid="S2">2</supplr>).</p>
         <suppl id="S2">
            <title>
               <p>Additional file 2</p>
            </title>
            <text>
               <p>Parsimony reconstruction of GT > GC and GC > GT donor signal conversion events for different sets of outgroup species.</p>
            </text>
            <file name="1745-6150-3-30-S2.doc">
               <p>Click here for file</p>
            </file>
         </suppl>
         <p>The observed excess of GT>GC conversions can hardly be explained by a nucleotide substitution bias. It has been repeatedly shown that mammalian genomes have a tendency to become more AT-rich <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>. Even if one assumes that, for unknown reasons, this trend is reversed in the donor sites so that T to C substitutions are twice as frequent as C to T substitutions, such a bias would not account for the observed excess of GT>GC conversion events (P &lt; 10<sup>-10</sup> according to the &#967;<sup>2</sup> test for pooled conversion events).</p>
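         <p>The test can be illustrated with a minimal sketch of a one-degree-of-freedom &#967;<sup>2</sup> goodness-of-fit calculation. It assumes that the pooled counts are the 298 GT>GC and 40 GC>GT events mapped to internal branches and that the null hypothesis is the 2:1 ratio mentioned above; the function name is hypothetical and the exact test setup used in the study may differ.</p>
         <p><![CDATA[
```python
import math


def chi2_two_classes(observed, null_ratio):
    """Chi-square goodness-of-fit test for two classes (1 degree of freedom).

    `observed` holds the two counts; `null_ratio` holds the relative
    frequencies expected under the null hypothesis, e.g. (2, 1).
    """
    total = sum(observed)
    weight = sum(null_ratio)
    expected = [total * r / weight for r in null_ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Assumed pooled counts: 298 GT>GC versus 40 GC>GT events, null ratio 2:1.
stat, p = chi2_two_classes((298, 40), (2, 1))
print(f"chi2 = {stat:.1f}, P = {p:.1e}")
```
]]></p>
         <p>Under these assumptions the statistic is roughly 70, which corresponds to a probability far below the 10<sup>-10</sup> threshold quoted above.</p>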
         <p>Considering that mutational bias does not seem to be a plausible cause of the observed accumulation of GC donor sites, it seems most likely that this trend is related to the involvement of GC sites in alternative splicing, which is widespread and essential in mammals <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr></abbrgrp>. As GT>GC conversion can substantially alter the pattern of alternative splicing <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, such changes might be beneficial and eventually would be fixed in the population. Thus, positive selection could be a plausible explanation for the observed accumulation of GC in donor sites. However, an even more plausible scenario would involve evolution of a strong splice site context that would allow neutral fixation of GC sites. The neutrally fixed GC sites could then be recruited for alternative splicing and thus would become subject to purifying selection that prevents the reverse GC>GT conversion.</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The authors declare that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>AC performed sequence comparisons. SWH contributed to the interpretation of the results. AC and IBR contributed to the analysis of the results and wrote the initial draft of the manuscript. IBR and EVK conceived the study, contributed to the analysis of the results and wrote the final manuscript. All authors edited and approved the final version.</p>
      </sec>
      <sec>
         <st>
            <p>Reviewers' comments</p>
         </st>
         <sec>
            <st>
               <p>Reviewer's report 1: Jerzy Jurka, Genetic Information Research Institute</p>
            </st>
            <p>This paper reports the relatively straightforward observation that GC donor splice sites tend to accumulate during evolution of mammals. However, the suggestion of selection for alternative splicing would be more convincing if the authors could demonstrate the GC accumulation separately in AT-rich and GC-rich genomic regions in mammals, where the dynamics of GT replacement by GC may be different.</p>
            <sec>
               <st>
                  <p>Authors' response</p>
               </st>
               <p><it>We appreciate the suggestion that the dynamics of GC accumulation could depend on the base composition in the respective regions of the mammalian genomes</it>.</p>
               <p><it>Therefore, we performed a crude comparison of the occurrence of GC donor sites and the rates of donor site conversion in AT-rich and GC-rich regions</it>. Table <tblr tid="T1">1</tblr> <it>shows that, although there was a statistically significant excess of GC donor sites in GC-rich regions, there was only a marginal difference, not statistically significant, between the corresponding rates of donor site conversions. Thus, it appears that the accumulation of GC donor sites described here does not strongly depend on the nucleotide composition of the corresponding genomic regions</it>.</p>
               <tbl id="T1">
                  <title>
                     <p>Table 1</p>
                  </title>
                  <caption>
                     <p>GT and GC donor splice sites and conversion events in AT-rich and GC-rich regions<sup>a</sup></p>
                  </caption>
                  <tblbdy cols="4">
                     <r>
                        <c ca="left">
                           <p>Splice sites/conversion events</p>
                        </c>
                        <c ca="left">
                           <p>%A+T > 50%</p>
                        </c>
                        <c ca="left">
                           <p>%A+T &lt; 50%</p>
                        </c>
                        <c ca="left">
                           <p>P<sub>Fisher</sub>/GT<sup>b</sup></p>
                        </c>
                     </r>
                     <r>
                        <c cspan="4">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Total number of donor splice sites(%)</p>
                        </c>
                        <c ca="left">
                           <p>51798(100)</p>
                        </c>
                        <c ca="left">
                           <p>71396(100)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>GT</p>
                        </c>
                        <c ca="left">
                           <p>51575(99.57)</p>
                        </c>
                        <c ca="left">
                           <p>71028(99.48)</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>GC</p>
                        </c>
                        <c ca="left">
                           <p>88(0.17)</p>
                        </c>
                        <c ca="left">
                           <p>165(0.23)</p>
                        </c>
                        <c ca="left">
                           <p>0.02</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>GT>GC</p>
                        </c>
                        <c ca="left">
                           <p>113(0.22)</p>
                        </c>
                        <c ca="left">
                           <p>185(0.26)</p>
                        </c>
                        <c ca="left">
                           <p>0.15</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>GC>GT</p>
                        </c>
                        <c ca="left">
                           <p>22(0.04)</p>
                        </c>
                        <c ca="left">
                           <p>18(0.03)</p>
                        </c>
                        <c ca="left">
                           <p>0.11</p>
                        </c>
                     </r>
                  </tblbdy>
                  <tblfn>
                     <p><sup>a</sup>AT content was estimated in &#177; 100 base pair regions surrounding donor splice sites</p>
                     <p><sup>b</sup>Statistical significance was estimated by comparing the respective values with the number of GT sites using Fisher's two-tailed test</p>
                  </tblfn>
               </tbl>
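               <p>As an illustration of the comparison described in footnote b, the following minimal sketch computes a two-tailed Fisher exact test for the GC row of Table 1 against the GT counts. The 2x2 layout (GC versus GT sites in AT-rich versus GC-rich regions) is an assumption about how the test was set up, and the function names are hypothetical.</p>
               <p><![CDATA[
```python
import math


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log_hypergeom(k, row1, row2, col1):
    """Log probability of k in the top-left cell with all margins fixed."""
    return log_comb(row1, k) + log_comb(row2, col1 - k) - log_comb(row1 + row2, col1)


def fisher_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables that are no more likely than
    the observed one, keeping the row and column totals fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    log_observed = log_hypergeom(a, row1, row2, col1)
    tolerance = 1e-7  # guards against floating-point ties
    total = 0.0
    for k in range(lowest, highest + 1):
        log_p = log_hypergeom(k, row1, row2, col1)
        if log_observed + tolerance >= log_p:
            total += math.exp(log_p)
    return min(1.0, total)


# Assumed layout: rows are AT-rich and GC-rich regions, columns are GC and GT sites.
p_gc = fisher_two_tailed(88, 51575, 165, 71028)
print(f"P = {p_gc:.3f}")
```
]]></p>
               <p>The result can be compared with the value of 0.02 given for the GC row of Table 1; the conversion rows can be examined in the same way by substituting the corresponding counts.</p>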
            </sec>
         </sec>
         <sec>
            <st>
               <p>Reviewer's report 2: Anton Nekrutenko, Pennsylvania State University</p>
            </st>
            <p>In this discovery note, the authors point out the accumulation of non-canonical GC donor splice signals in mammals, against the previously observed nucleotide substitution bias. They provide a convincing explanation suggesting that GT->GC conversion may be beneficial for mammals as it creates additional possibilities for alternative splicing events. In my opinion, these observations provide a platform for launching a more detailed investigation of alternative splicing through comparative genomics and raise numerous interesting questions (e.g., do any of the GC sites overlap with known SNPs?). Thus, publication of this note will appeal to a broad evolutionary genomics community.</p>
            <p>On the technical side, the authors used a rather complex procedure for retrieving splice sites from TBA alignments. Instead, this can be easily and quickly done using the Galaxy system:</p>
            <p>
               <url>http://main.g2.bx.psu.edu</url>
            </p>
            <p>as explained here:</p>
            <p>
               <url>http://g2.trac.bx.psu.edu/wiki/MAFanalysis</url>
            </p>
            <sec>
               <st>
                  <p>Authors' response</p>
               </st>
               <p><it>We appreciate the reviewer pointing out the utility of the Galaxy platform and hope to exploit Galaxy in future genome analyses</it>.</p>
            </sec>
         </sec>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The research of IBR and EVK is supported by the Department of Health and Human Services intramural program (NIH, National Library of Medicine). The research of AC and SWH is supported by an NIH National Library of Medicine K-22 award (K22LM008794), and via private funding from the Research Institute for Children, New Orleans.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Pre-mRNA splicing: awash in a sea of proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Jurica</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Moore</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>Mol Cell</source>
            <pubdate>2003</pubdate>
            <volume>12</volume>
            <issue>1</issue>
            <fpage>5</fpage>
            <lpage>14</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1097-2765(03)00270-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">12887888</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>The spliceosome: the most complex macromolecular machine in the cell?</p>
            </title>
            <aug>
               <au>
                  <snm>Nilsen</snm>
                  <fnm>TW</fnm>
               </au>
            </aug>
            <source>Bioessays</source>
            <pubdate>2003</pubdate>
            <volume>25</volume>
            <issue>12</issue>
            <fpage>1147</fpage>
            <lpage>1149</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/bies.10394</pubid>
                  <pubid idtype="pmpid" link="fulltext">14635248</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Complex spliceosomal organization ancestral to extant eukaryotes</p>
            </title>
            <aug>
               <au>
                  <snm>Collins</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Penny</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2005</pubdate>
            <volume>22</volume>
            <issue>4</issue>
            <fpage>1053</fpage>
            <lpage>1066</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/molbev/msi091</pubid>
                  <pubid idtype="pmpid" link="fulltext">15659557</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Splicing double: insights from the second spliceosome</p>
            </title>
            <aug>
               <au>
                  <snm>Patel</snm>
                  <fnm>AA</fnm>
               </au>
               <au>
                  <snm>Steitz</snm>
                  <fnm>JA</fnm>
               </au>
            </aug>
            <source>Nat Rev Mol Cell Biol</source>
            <pubdate>2003</pubdate>
            <volume>4</volume>
            <issue>12</issue>
            <fpage>960</fpage>
            <lpage>970</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nrm1259</pubid>
                  <pubid idtype="pmpid" link="fulltext">14685174</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Who's on first? The U1 snRNP-5' splice site interaction and splicing</p>
            </title>
            <aug>
               <au>
                  <snm>Rosbash</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Seraphin</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>Trends Biochem Sci</source>
            <pubdate>1991</pubdate>
            <volume>16</volume>
            <issue>5</issue>
            <fpage>187</fpage>
            <lpage>190</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0968-0004(91)90073-5</pubid>
                  <pubid idtype="pmpid">1882420</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>The U1 snRNP protein U1C recognizes the 5' splice site in the absence of base pairing</p>
            </title>
            <aug>
               <au>
                  <snm>Du</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Rosbash</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2002</pubdate>
            <volume>419</volume>
            <issue>6902</issue>
            <fpage>86</fpage>
            <lpage>90</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature00947</pubid>
                  <pubid idtype="pmpid" link="fulltext">12214237</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Comparative analysis detects dependencies among the 5' splice-site positions</p>
            </title>
            <aug>
               <au>
                  <snm>Carmel</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Tal</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Vig</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Ast</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>RNA</source>
            <pubdate>2004</pubdate>
            <volume>10</volume>
            <issue>5</issue>
            <fpage>828</fpage>
            <lpage>840</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1370573</pubid>
                  <pubid idtype="pmpid" link="fulltext">15100438</pubid>
                  <pubid idtype="doi">10.1261/rna.5196404</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Analysis of canonical and non-canonical splice sites in mammalian genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Burset</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Seledtsov</snm>
                  <fnm>IA</fnm>
               </au>
               <au>
                  <snm>Solovyev</snm>
                  <fnm>VV</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2000</pubdate>
            <volume>28</volume>
            <issue>21</issue>
            <fpage>4364</fpage>
            <lpage>4375</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">113136</pubid>
                  <pubid idtype="pmpid" link="fulltext">11058137</pubid>
                  <pubid idtype="doi">10.1093/nar/28.21.4364</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Human GC-AG alternative intron isoforms with weak donor sites show enhanced consensus at acceptor exon positions</p>
            </title>
            <aug>
               <au>
                  <snm>Thanaraj</snm>
                  <fnm>TA</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2001</pubdate>
            <volume>29</volume>
            <issue>12</issue>
            <fpage>2581</fpage>
            <lpage>2593</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">55748</pubid>
                  <pubid idtype="pmpid" link="fulltext">11410667</pubid>
                  <pubid idtype="doi">10.1093/nar/29.12.2581</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Comparison of splice sites in mammals and chicken</p>
            </title>
            <aug>
               <au>
                  <snm>Abril</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Castelo</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Guigo</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2005</pubdate>
            <volume>15</volume>
            <issue>1</issue>
            <fpage>111</fpage>
            <lpage>119</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540285</pubid>
                  <pubid idtype="pmpid" link="fulltext">15590946</pubid>
                  <pubid idtype="doi">10.1101/gr.3108805</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Analysis of the role of Caenorhabditis elegans GC-AG introns in regulated splicing</p>
            </title>
            <aug>
               <au>
                  <snm>Farrer</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Roller</snm>
                  <fnm>AB</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Zahler</snm>
                  <fnm>AM</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <issue>15</issue>
            <fpage>3360</fpage>
            <lpage>3367</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">137088</pubid>
                  <pubid idtype="pmpid" link="fulltext">12140320</pubid>
                  <pubid idtype="doi">10.1093/nar/gkf465</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Comprehensive analysis of alternative splicing in rice and comparative analyses with <it>Arabidopsis</it></p>
            </title>
            <aug>
               <au>
                  <snm>Campbell</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Haas</snm>
                  <fnm>BJ</fnm>
               </au>
               <au>
                  <snm>Hamilton</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Mount</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Buell</snm>
                  <fnm>CR</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>327</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1769492</pubid>
                  <pubid idtype="pmpid" link="fulltext">17194304</pubid>
                  <pubid idtype="doi">10.1186/1471-2164-7-327</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Aligning multiple genomic sequences with the threaded blockset aligner</p>
            </title>
            <aug>
               <au>
                  <snm>Blanchette</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Riemer</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Elnitski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Smit</snm>
                  <fnm>AF</fnm>
               </au>
               <au>
                  <snm>Roskin</snm>
                  <fnm>KM</fnm>
               </au>
               <au>
                  <snm>Baertsch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Rosenbloom</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Clawson</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>ED</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2004</pubdate>
            <volume>14</volume>
            <issue>4</issue>
            <fpage>708</fpage>
            <lpage>715</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">383317</pubid>
                  <pubid idtype="pmpid" link="fulltext">15060014</pubid>
                  <pubid idtype="doi">10.1101/gr.1933104</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>The UCSC genome browser database: update 2007</p>
            </title>
            <aug>
               <au>
                  <snm>Kuhn</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Karolchik</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Zweig</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Trumbower</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Thakkapallayil</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sugnet</snm>
                  <fnm>CW</fnm>
               </au>
               <au>
                  <snm>Stanke</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>KE</fnm>
               </au>
               <au>
                  <snm>Siepel</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Rosenbloom</snm>
                  <fnm>KR</fnm>
               </au>
               <au>
                  <snm>Rhead</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Raney</snm>
                  <fnm>BJ</fnm>
               </au>
               <au>
                  <snm>Pohl</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pedersen</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Hsu</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Hinrichs</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Harte</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Diekhans</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Clawson</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Bejerano</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Barber</snm>
                  <fnm>GP</fnm>
               </au>
               <au>
                  <snm>Baertsch</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <issue>Database issue</issue>
            <fpage>D668</fpage>
            <lpage>D673</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1669757</pubid>
                  <pubid idtype="pmpid" link="fulltext">17142222</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl928</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Ecdysozoan clade rejected by genome-wide analysis of rare amino acid replacements</p>
            </title>
            <aug>
               <au>
                  <snm>Rogozin</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Wolf</snm>
                  <fnm>YI</fnm>
               </au>
               <au>
                  <snm>Carmel</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Koonin</snm>
                  <fnm>EV</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>2007</pubdate>
            <volume>24</volume>
            <issue>4</issue>
            <fpage>1080</fpage>
            <lpage>1090</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/molbev/msm029</pubid>
                  <pubid idtype="pmpid" link="fulltext">17299026</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Comparative analyses of multi-species sequences from targeted genomic regions</p>
            </title>
            <aug>
               <au>
                  <snm>Thomas</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Touchman</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Blakesley</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Bouffard</snm>
                  <fnm>GG</fnm>
               </au>
               <au>
                  <snm>Beckstrom-Sternberg</snm>
                  <fnm>SM</fnm>
               </au>
               <au>
                  <snm>Margulies</snm>
                  <fnm>EH</fnm>
               </au>
               <au>
                  <snm>Blanchette</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Siepel</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Thomas</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>McDowell</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Maskeri</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Hansen</snm>
                  <fnm>NF</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Weber</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Kent</snm>
                  <fnm>WJ</fnm>
               </au>
               <au>
                  <snm>Karolchik</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Bruen</snm>
                  <fnm>TC</fnm>
               </au>
               <au>
                  <snm>Bevan</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Cutler</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Schwartz</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Elnitski</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Idol</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Prasad</snm>
                  <fnm>AB</fnm>
               </au>
               <au>
                  <snm>Lee-Lin</snm>
                  <fnm>SQ</fnm>
               </au>
               <au>
                  <snm>Maduro</snm>
                  <fnm>VV</fnm>
               </au>
               <au>
                  <snm>Summers</snm>
                  <fnm>TJ</fnm>
               </au>
               <au>
                  <snm>Portnoy</snm>
                  <fnm>ME</fnm>
               </au>
               <au>
                  <snm>Dietrich</snm>
                  <fnm>NL</fnm>
               </au>
               <au>
                  <snm>Akhter</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Ayele</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Benjamin</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Cariaga</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Brinkley</snm>
                  <fnm>CP</fnm>
               </au>
               <au>
                  <snm>Brooks</snm>
                  <fnm>SY</fnm>
               </au>
               <au>
                  <snm>Granite</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Guan</snm>
                  <fnm>X</fnm>
               </au>
               <au>
                  <snm>Gupta</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Haghighi</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Ho</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Karlins</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Laric</snm>
                  <fnm>PL</fnm>
               </au>
               <au>
                  <snm>Legaspi</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Lim</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Maduro</snm>
                  <fnm>QL</fnm>
               </au>
               <au>
                  <snm>Masiello</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Mastrian</snm>
                  <fnm>SD</fnm>
               </au>
               <au>
                  <snm>McCloskey</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Pearson</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Stantripop</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tiongson</snm>
                  <fnm>EE</fnm>
               </au>
               <au>
                  <snm>Tran</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Tsurgeon</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Vogt</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Walker</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Wetherby</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Wiggins</snm>
                  <fnm>LS</fnm>
               </au>
               <au>
                  <snm>Young</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>LH</fnm>
               </au>
               <au>
                  <snm>Osoegawa</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Zhao</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Shu</snm>
                  <fnm>CL</fnm>
               </au>
               <au>
                  <snm>De Jong</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Lawrence</snm>
                  <fnm>CE</fnm>
               </au>
               <au>
                  <snm>Smit</snm>
                  <fnm>AF</fnm>
               </au>
               <au>
                  <snm>Chakravarti</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Haussler</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Green</snm>
                  <fnm>ED</fnm>
               </au>
            </aug>
            <source>Nature</source>
            <pubdate>2003</pubdate>
            <volume>424</volume>
            <issue>6950</issue>
            <fpage>788</fpage>
            <lpage>793</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/nature01858</pubid>
                  <pubid idtype="pmpid" link="fulltext">12917688</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Neighboring base effect on emergence of spontaneous mutations in human pseudogenes</p>
            </title>
            <aug>
               <au>
                  <snm>Pozdniakov</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Rogozin</snm>
                  <fnm>IB</fnm>
               </au>
               <au>
                  <snm>Babenko</snm>
                  <fnm>VN</fnm>
               </au>
               <au>
                  <snm>Kolchanov</snm>
                  <fnm>NA</fnm>
               </au>
            </aug>
            <source>Dokl Akad Nauk</source>
            <pubdate>1997</pubdate>
            <volume>356</volume>
            <issue>4</issue>
            <fpage>566</fpage>
            <lpage>568</lpage>
            <xrefbib>
               <pubid idtype="pmpid">9424178</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Patterns of nucleotide substitution in <it>Drosophila</it> and mammalian genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Petrov</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Hartl</snm>
                  <fnm>DL</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1999</pubdate>
            <volume>96</volume>
            <issue>4</issue>
            <fpage>1475</fpage>
            <lpage>1479</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">15487</pubid>
                  <pubid idtype="pmpid" link="fulltext">9990048</pubid>
                  <pubid idtype="doi">10.1073/pnas.96.4.1475</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>Directionality of point mutation and 5-methylcytosine deamination rates in the chimpanzee genome</p>
            </title>
            <aug>
               <au>
                  <snm>Jiang</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Zhao</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>BMC Genomics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>316</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1764022</pubid>
                  <pubid idtype="pmpid" link="fulltext">17166280</pubid>
                  <pubid idtype="doi">10.1186/1471-2164-7-316</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>

