<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1745-6150-2-1</ui>
   <ji>1745-6150</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>pkaPS: prediction of protein kinase A phosphorylation sites with the simplified kinase-substrate binding model</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Neuberger</snm>
               <fnm>Georg</fnm>
               <insr iid="I1"/>
               <email>neuberger@imp.univie.ac.at</email>
            </au>
            <au id="A2">
               <snm>Schneider</snm>
               <fnm>Georg</fnm>
               <insr iid="I1"/>
               <email>schneider@imp.univie.ac.at</email>
            </au>
            <au id="A3" ca="yes">
               <snm>Eisenhaber</snm>
               <fnm>Frank</fnm>
               <insr iid="I1"/>
               <email>Frank.Eisenhaber@imp.univie.ac.at</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>IMP &#8211; Research Institute of Molecular Pathology, Dr. Bohr-Gasse 7, A-1030 Vienna, Austria</p>
            </ins>
         </insg>
         <source>Biology Direct</source>
         <issn>1745-6150</issn>
         <pubdate>2007</pubdate>
         <volume>2</volume>
         <issue>1</issue>
         <fpage>1</fpage>
         <url>http://www.biology-direct.com/content/2/1/1</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17222345</pubid>
               <pubid idtype="doi">10.1186/1745-6150-2-1</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>09</day>
               <month>1</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>12</day>
               <month>1</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>12</day>
               <month>1</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Neuberger et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Protein kinase A (cAMP-dependent kinase, PKA) is a serine/threonine kinase, for which ca. 150 substrate proteins are known. Based on a refinement of the recognition motif using the available experimental data, we wished to apply the simplified substrate protein binding model for accurate prediction of PKA phosphorylation sites, an approach that was previously successful for the prediction of lipid posttranslational modifications and of the PTS1 peroxisomal translocation signal.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>Approximately 20 sequence positions flanking the phosphorylated residue on both sides have been found to be restricted in their sequence variability (region -18...+23 with the site at position 0). The conserved physical pattern can be rationalized in terms of a qualitative model of binding to the catalytic cleft of protein kinase A. Positions -6...+4 surrounding the phosphorylation site are influenced to varying degrees by direct interaction with the kinase. This sequence stretch is embedded in an intrinsically disordered region composed preferentially of hydrophilic residues with flexible backbones and small side chains. This knowledge has been incorporated into a simplified analytical model of productive binding of substrate proteins with PKA.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>The scoring function of the pkaPS predictor can confidently discriminate PKA phosphorylation sites from serines/threonines with non-permissive sequence environments (sensitivity of ~96% at a specificity of ~94%). The tool "pkaPS" has been applied to the whole human proteome. Among the newly predicted PKA targets, there are entirely uncharacterized protein groups as well as apparently well-known families such as those of the ribosomal proteins L21e, L22 and L6.</p>
            </sec>
            <sec>
               <st>
                  <p>Availability</p>
               </st>
               <p>The supplementary data and the prediction tool, provided as a WWW server, are available at <url>http://mendel.imp.univie.ac.at/sat/pkaPS</url>.</p>
            </sec>
            <sec>
               <st>
                  <p>Reviewers</p>
               </st>
               <p>Erik van Nimwegen (Biozentrum, University of Basel, Switzerland), Sandor Pongor (International Centre for Genetic Engineering and Biotechnology, Trieste, Italy), Igor Zhulin (University of Tennessee, Oak Ridge National Laboratory, USA).</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <meta>
      <classifications>
         <classification type="bmc" subtype="user_supplied_xml" id="refman"/>
      </classifications>
   </meta>
   <bdy>
      <sec>
         <st>
            <p>Open peer review</p>
         </st>
         <p>This article was reviewed by Erik van Nimwegen (Biozentrum, University of Basel, Switzerland), Sandor Pongor (International Centre for Genetic Engineering and Biotechnology, Trieste, Italy) and Igor Zhulin (University of Tennessee, Oak Ridge National Laboratory, USA). For the full reviews, please go to the Reviewers' comments section.</p>
      </sec>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>Phosphorylation is one of the biologically most important post-translational modifications known today. Eukaryotic kinases, the enzymes responsible for this type of chemical alteration, transfer phosphate moieties onto the hydroxyl groups of serines, threonines or tyrosines of substrate peptides. Phosphorylation plays a key role in a large set of signal transduction pathways and is known to regulate the functions of a vast number of different proteins. Not only are substrate motifs for phosphorylation found in proteins from various cellular contexts, there are also >500 kinases <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> with at least partly non-overlapping substrate specificities encoded in each of the higher eukaryote genomes. This broad distribution, coupled with the potential medical applications, makes kinases interesting research targets with regard to their role in signaling cascades. Therefore, it is important to determine the complete protein substrate set for each kinase. The sheer number of as yet uncharacterized proteins implies that many phosphorylation motifs have remained undetected so far. Accurate <it>in silico </it>predictors that recognize kinase substrates from their amino acid sequences are desirable to bring this task closer to a solution. A low false-positive prediction rate is especially important in this context.</p>
         <p>Protein kinase A (PKA), alternatively called cAMP-dependent protein kinase, is one of the best studied members of the kinase group of enzymes and, therefore, appears among the most attractive targets for substrate site predictor development. It is actually the first kinase for which the crystal structure has been resolved <abbrgrp><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>. PKA acts on serine and, to a lesser extent, threonine residues that are embedded in a specific recognition motif. In its first characterizations, the PKA motif was described as consisting of arginines at the 3<sup>rd </sup>and 2<sup>nd </sup>positions prior to the phosphorylation site, and of a large hydrophobic amino acid immediately thereafter <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp>.</p>
         <p>Several groups have already applied various approaches for predicting PKA phosphorylation sites from primary protein sequence. NETPHOS <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> was one of the first to outperform simpler PROSITE-like approaches <abbrgrp><abbr bid="B9">9</abbr><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr></abbrgrp> by applying artificial neural networks. A more recent version, NETPHOSK <abbrgrp><abbr bid="B12">12</abbr></abbrgrp>, makes kinase-specific predictions. SCANSITE 2.0 uses position-specific scoring matrices (PSSM) to predict phosphorylation motifs for 62 different kinases, again including PKA <abbrgrp><abbr bid="B13">13</abbr></abbrgrp>. PREDPHOSPHO is a kinase-specific predictor that uses support vector machines <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>. GPS does not use standard machine learning approaches but implements a so-called group-based scoring technique, which makes use of the BLOSUM62 matrix to score distances between query sequences and known clusters of kinase substrate peptides <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp>. Because GPS relies on direct sequence similarity, it is especially likely to recognize query peptides that resemble known sites, whereas it might have difficulties with unusual substrates of the same kinase that are not reflected in the learning set. Among these tools, GPS <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr></abbrgrp> and PREDPHOSPHO <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> appear to have the highest accuracy. Although the sequence sets used for testing are limited, their sensitivities are clearly below 90% for specificities estimated to be close to 90%. As more than 10% of the query sites are expected to be misclassified, database-wide studies that rely solely on current predictors cannot produce reliable results.</p>
         <p>In order to achieve higher sensitivity and specificity, major improvements are needed. In this work, we implemented two new aspects: (i) Since there is no "average phosphorylation site", high prediction accuracy can only be achieved if the function for scoring of putative phosphorylation sites is specific for each kinase system. In our approach, the scoring function is thought to estimate the probability of productive binding of the respective substrate protein segment with the binding site of PKA; thus, the scoring function is a simplified physical model of the binding process <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr></abbrgrp>. (ii) The motif regions that are used to discriminate between true sites and non-permissive targets should be as long as possible. These shall include all substrate sequence stretches that influence the binding process and should not be restricted to the region of the motif that is most conserved in terms of amino acid types. It is also necessary to consider properties of correlated motif positions <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>.</p>
         <p>It should be noted that, for most post-translational modifications, only a handful of substrate proteins per modifying enzyme are known. Even for the better studied cases, the available experimental information can only reliably parameterize a scoring function with a small number of fitted values. In similar cases of predictor development such as for GPI lipid anchoring <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>, N-terminal N-myristoylation <abbrgrp><abbr bid="B20">20</abbr></abbrgrp>, prenylation <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> and peroxisomal targeting <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B24">24</abbr></abbrgrp>, our simplified substrate protein binding model has been successfully applied. In all these cases, however, the sequence signal coding for the posttranslational modification or the translocation is located at either the N- or the C-terminal end of the polypeptide chain. In this work, we wanted to test the approach for an internal sequence signal.</p>
         <p>PKA-dependent phosphorylation is an excellent example in this context since the rich experimental data allow for the derivation of a quite accurate qualitative binding site model, as we show in this work. Not only are there more than 200 documented phosphorylation sites for PKA; the available sequence data are also accompanied by other valuable heterogeneous information such as 3D data and mutation experiments <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B25">25</abbr></abbrgrp>.</p>
      </sec>
      <sec>
         <st>
            <p>Results</p>
         </st>
         <sec>
            <st>
               <p>Overview</p>
            </st>
            <p>The whole work consists of two major parts &#8211; first, the derivation of the property pattern that characterizes sequence segments with PKA phosphorylation sites and, second, the development and the validation of a prediction tool for the recognition of PKA phosphorylation sites in query sequences.</p>
            <p>The following four sections of the Results ("The motif length", "Positive charge in the N-terminal flank", "Polarity and flexibility in the C-terminal flank", "Phylogenetic variation of the substrate binding site of PKA") are dedicated to the derivation of the sequence motif coding for PKA phosphorylation sites. This work is based on analyses of the sequence environment of known phosphorylation sites in substrate proteins and of the PKA sequences and structures. We correlate amino acid compositions at various alignment positions with physical properties of amino acid residues. As major results, we obtain the sequence length of the motif and the pattern of physical properties in various sequence segments surrounding the phosphorylation site. Moreover, if several phosphorylation sites occur in one protein, they tend to be sequentially clustered.</p>
            <p>The next three sections ("Predictor description and the self-consistency test", "Neighbor-jackknife test", "Summary of the prediction performance and comparison to other tools") describe the development of the prediction tool and its validation with the self-consistency test and a rigorous cross-validation procedure called the neighbor-jackknife test (exclusion of groups of sequentially similar proteins). The specificity and the sensitivity values are close to 95% and, thus, superior to those of previously published predictors.</p>
            <p>The succeeding section of the Results ("Prediction of PKA targets within the human proteome") describes the application of the predictor to the human proteome. Among the newly predicted PKA targets, there are entirely uncharacterized protein groups as well as apparently well-known families such as those of the ribosomal proteins L21e, L22 and L6. The last section of the Results ("Description of the associated WWW site") supplies information about the PKA WWW server.</p>
            <sec>
               <st>
                  <p>The motif length</p>
               </st>
               <p>The deduction of accurate motif boundaries is not straightforward, as this region also comprises positions that make only minor contributions to substrate recognition by PKA. For example, these include residues that interact only weakly with the receptor or whose contribution depends on neighboring positions. As a consequence, it is helpful to base such estimations on a standard model which has already been successfully applied in related situations.</p>
               <p>The concept of a linker-embedded binding motif is well suited for this task. The underlying assumption is that the peptide stretch which binds to the receptor enzyme and which is buried in the catalytic cleft must first be made accessible for interaction: as part of an intrinsically disordered region, through a permanent native location on the surface of the globular part of the substrate protein or via exposure after an induced conformational change. As a consequence, the flanking regions which connect the sequence segment that fits into the catalytic cavity and the rest of the substrate protein must have sufficient conformational flexibility and hydrophilicity. Such a motif structure has already been observed and successfully applied in predictor development <abbrgrp><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B26">26</abbr></abbrgrp>. Recent work by Dunker and co-workers further confirms the applicability of this model to protein phosphorylation motifs as they find evidence for inherently disordered regions surrounding phosphorylated residues. They used a similar formulation of the concept for "disorder enhanced" prediction of phosphorylation sites <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>.</p>
               <p>Mean values of amino acid property indices <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B28">28</abbr></abbrgrp> (including many flexibility and hydrophobicity scales) were calculated over a gapless multiple alignment of learning set sequences, each of which consists of the modified site in the center together with 40 flanking residues on each side. Sequence redundancy was removed by applying a method which involves frequencies of identical residues on alignment positions -6 to +6 (Materials and Methods). As examples, we show the outcomes obtained for the hydrophobicity scale EISD840101 <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> and for VINM940104 <abbrgrp><abbr bid="B30">30</abbr></abbrgrp> as flexibility measure in Figure <figr fid="F1">1</figr>.</p>
               <fig id="F1">
                  <title>
                     <p>Figure 1</p>
                  </title>
                  <caption>
                     <p>Variation of hydrophobicity and of flexibility over the motif region</p>
                  </caption>
                  <text>
                     <p><b>Variation of hydrophobicity and of flexibility over the motif region</b>. The graph depicts the mean value deviations of the hydrophobicity-related property EISD840101 [29] and the flexibility scale VINM940104 [30] over the 81 positions that encompass the learning set sites. The mean values are presented as deviations from the UNIREF average (baseline) in percent of UNIREF standard deviations. The plots were smoothed by applying sliding windows (running averages) over 5 residues. Mean values were calculated using two different sequence sets: (i) one that contains all entries from the learning set, and (ii) one that consists of all proteins that are phosphorylated only once in the learning set. The difference between these two curves is not dramatic although, as a trend, the property values appear to fall back more sharply to the database values if only proteins with single PKA phosphorylation sites are taken into account.</p>
                  </text>
                  <graphic file="1745-6150-2-1-1"/>
               </fig>
               <p>The calculated values deviate from the database averages over a sequence stretch that covers about twenty positions on both the N- and the C-terminal side of the documented phosphorylation site. The curves fall back to the average database values with increasing distance from the phosphorylated site. Moreover, similar behavior is exhibited by many other hydrophobicity- and flexibility-related properties (data not shown). It is also interesting that this region is slightly longer on the C-terminal side than on the N-terminal one. This might be a result of the more hydrophobic nature of the residues that lie adjacent to the phosphorylated site on the C-terminal side. As depicted in Figure <figr fid="F1">1</figr>, the motif boundaries cannot be pinned down unambiguously to a unique position. We set the edges well into the regions where the property mean values do not fall below the steady level of approximately &#177; 20%. The resulting region is defined from positions -18 to +23 and, thus, we estimate the total length of the sequence signal for PKA-dependent phosphorylation as 42 positions.</p>
               <p>The significance of multiple phosphorylated residues within the same motif region is an important issue that needs to be addressed (see also legend to Figure <figr fid="F1">1</figr>). We find that, in two thirds of all cases, pairs of phosphorylated serines/threonines are separated by no more than 50 residues in the sequence (Figure <figr fid="F2">2</figr>). This threshold is approximately the motif length derived above. Theoretically, every proximal neighboring site would prolong one of the linkers by at least the distance between the sites. From a biological point of view, it appears reasonable to pack multiple phosphorylation sites closely together. In such a situation, long additional linker stretches that would be necessary for maintaining an inherent structural disorder in the environment of phosphorylation sites <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> are avoided.</p>
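               <p>The profile calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the toy property scale and the toy background statistics are hypothetical, and the handling of the alignment ends (shortened smoothing windows) is an assumption of this example.</p>
               <p>
```python
from statistics import mean


def position_deviations(windows, scale, bg_mean, bg_sd):
    """Mean property value per alignment column, expressed as the
    deviation from the background mean in percent of the background SD.

    windows : equal-length, gapless sequence windows centered on the site
    scale   : dict mapping one-letter amino acid codes to property values
    """
    length = len(windows[0])
    profile = []
    for col in range(length):
        # Residues missing from the scale (e.g. 'X') are skipped.
        values = [scale[w[col]] for w in windows if w[col] in scale]
        col_mean = mean(values) if values else bg_mean
        profile.append(100.0 * (col_mean - bg_mean) / bg_sd)
    return profile


def running_average(profile, width=5):
    """Smooth a profile with a centered sliding window.

    Assumption of this sketch: windows are truncated at the ends of
    the profile instead of being padded.
    """
    half = width // 2
    smoothed = []
    for i in range(len(profile)):
        start = max(0, i - half)
        stop = min(len(profile), i + half + 1)
        smoothed.append(mean(profile[start:stop]))
    return smoothed


if __name__ == "__main__":
    # Hypothetical toy scale and background values, not EISD840101.
    toy_scale = {"A": 1.0, "R": -1.0}
    toy_windows = ["AAR", "ARR"]
    raw = position_deviations(toy_windows, toy_scale, bg_mean=0.0, bg_sd=2.0)
    print(running_average(raw, width=3))
```
               </p>
               <p>With real data, the windows would be the 81-residue learning set segments and the background statistics would be taken from a reference database such as UNIREF, as in Figure 1.</p>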
               <fig id="F2">
                  <title>
                     <p>Figure 2</p>
                  </title>
                  <caption>
                     <p>Cumulative distribution of distances between successive sites in learning set proteins with multiple phosphorylated serine/threonine residues</p>
                  </caption>
                  <text>
                     <p><b>Cumulative distribution of distances between successive sites in learning set proteins with multiple phosphorylated serine/threonine residues</b>. The figure demonstrates that about two thirds of all distances are within the extended motif length of approximately 50 positions. The maximum distance, which exceeds the displayed x-axis, is 1759 amino acids.</p>
                  </text>
                  <graphic file="1745-6150-2-1-2"/>
               </fig>
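               <p>The distance statistic summarized in Figure 2 can be illustrated with a short Python sketch. The protein identifiers and site positions below are invented for illustration, and the function names are hypothetical; the sketch only shows how distances between successive sites and the fraction within a cutoff could be computed.</p>
               <p>
```python
def successive_distances(sites_by_protein):
    """Distances between neighboring phosphorylation sites per protein.

    sites_by_protein : dict mapping a protein identifier to a list of
                       site positions; proteins with a single site
                       contribute no distance.
    """
    distances = []
    for positions in sites_by_protein.values():
        ordered = sorted(positions)
        distances.extend(b - a for a, b in zip(ordered, ordered[1:]))
    return distances


def cumulative_fraction(distances, cutoff):
    """Fraction of distances that do not exceed the cutoff."""
    if not distances:
        return 0.0
    within = sum(1 for d in distances if cutoff >= d)
    return within / len(distances)


if __name__ == "__main__":
    # Hypothetical example data, not taken from the learning set.
    toy_sites = {"P1": [10, 30, 200], "P2": [5, 50], "P3": [7]}
    dists = successive_distances(toy_sites)
    print(sorted(dists), cumulative_fraction(dists, 50))
```
               </p>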
            </sec>
            <sec>
               <st>
                  <p>Positive charge in the N-terminal flank</p>
               </st>
               <p>Historically, charge requirements were the first observed characteristics of the PKA motif. Kinetic studies at the end of the 1970s revealed a cluster of positive residues directly N-terminal to the phosphorylated site as the main determinant of PKA substrate specificity. The main constituents of this cluster are the 2<sup>nd </sup>and 3<sup>rd </sup>positions prior to the phosphorylated serine or threonine. Kemp <it>et al</it>. <abbrgrp><abbr bid="B5">5</abbr></abbrgrp> postulated that at least one arginine should be present at one of these locations. Moreover, replacement of the arginine by lysine was reported to cause less activity loss than substitutions by other amino acids. In another study <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>, the adjacent arginines were positioned at various distances from the phosphorylated site and activity measurements were performed. The results demonstrated that the binding affinity is indeed highest at positions -3/-2 and decreases with increasing distance from the site.</p>
               <p>The requirement for positively charged residues is depicted in the 3D structure of PKA bound to an inhibitor peptide (Figure <figr fid="F3">3</figr>). The arginines at positions -3 and -2 interact with Glu127 and Glu170/Glu230 of the PKA enzyme, respectively <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. Both substrate residues make close contacts with the enzyme in a spatially restricted binding pocket, explaining their importance in determining substrate specificity. In this structure, the arginine at position -6 also contributes to substrate recognition by interacting with Glu203.</p>
               <fig id="F3">
                  <title>
                     <p>Figure 3</p>
                  </title>
                  <caption>
                     <p>Structure of the inhibitor peptide PKI bound to the PKA enzyme: N-terminal region of the substrate</p>
                  </caption>
                  <text>
                     <p><b>Structure of the inhibitor peptide PKI bound to the PKA enzyme: N-terminal region of the substrate</b>. Key arginines from the substrate peptide (RCSB Protein Data Bank entry 1JLU [92]) are highlighted. The left part of the figure shows the surface of PKA in ochre, the backbone of the substrate peptide in silver and the arginines -6, -3 and -2 of the substrate in blue. Arginines -3 and -2 interact with the binding cleft and thereby make major contributions to substrate specificity. A set of acidic enzyme residues interacts with these arginines (zoomed detail-view to the right): Glu170 and Glu230 for Arg-2, Glu127 for Arg-3 and Glu203 for Arg-6 [3]. The pictures were generated using VMD [93].</p>
                  </text>
                  <graphic file="1745-6150-2-1-3"/>
               </fig>
               <p>The requirement for positive charge is highest for residues -2 and -3 but can be detected as far as 6 to 8 residues prior to the phosphorylated serine (Figure <figr fid="F4">4</figr>). Several studies focus on the role of position -6 as this residue apparently interacts with the PKA enzyme <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B7">7</abbr></abbrgrp>. In contrast, it is unclear how the amino acids at positions -5 and -4 contribute to substrate specificity. Although positive charge at these locations is favored as much as at position -6, neither of them makes close contacts with PKA in any solved structure. The reason could lie in a variable structural context of this N-terminal region. The currently resolved substrate-bound 3D structures have typically been obtained using the same inhibitor peptide (PKI). Thus, other, as yet unknown conformations might exist if the bound peptide does not involve a positively charged residue at position -6. Alternatively, long-range charge interactions might contribute to substrate specificity at these positions. The preference for positive charge is further confirmed by Songyang <it>et al</it>. <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>, who used an oriented peptide library to demonstrate that the positional range -4 to -1 has strong preferences for arginine and, to a lesser extent, for histidine or lysine.</p>
               <fig id="F4">
                  <title>
                     <p>Figure 4</p>
                  </title>
                  <caption>
                     <p>Preference for positive charge at positions located N-terminally with regard to the phosphorylated site</p>
                  </caption>
                  <text>
                     <p><b>Preference for positive charge at positions located N-terminally with regard to the phosphorylated site</b>. The upper graph depicts the increased occurrence of positively charged residues (His, Lys, Arg) compared to the expected database occurrence of 13.6% (deduced from UNIREF). The lower part of the figure shows the correlation coefficients R between amino acid frequencies and ZIMJ680104 (isoelectric point) [87] property values. Both plots demonstrate that the preference for basic residues is highest at positions -3 and -2, but encompasses at least the entire region between amino acids -6 and -2.</p>
                  </text>
                  <graphic file="1745-6150-2-1-4"/>
               </fig>
               <p>Physico-chemical preferences in the region prior to the phosphorylation site are complemented with flexibility and polarity requirements, e.g. for the property VINM940103 (normalized flexibility parameters <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>, R &#8805; 0.62) at positions -8 to -6 and -4, or for the hydrophilicity-related scales EISD840101 <abbrgrp><abbr bid="B29">29</abbr></abbrgrp> (R &#8805; -0.66 at position -3 and -0.69 at position -2) and KRIW790102 <abbrgrp><abbr bid="B32">32</abbr></abbrgrp> (R = 0.60 at positions -7, -6 and -4). Although these might be a remnant of charge requirements, it seems clear that a substitution of arginine by hydrophilic residues is less disfavored than an exchange by bulky, apolar amino acids.</p>
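               <p>The two per-position statistics used in this section, the fraction of positively charged residues and the correlation coefficient R between amino acid frequencies and a property scale, can be sketched as follows. This is a hedged illustration of one plausible reading of the procedure (a Pearson correlation over the 20 amino acid types), not the authors' code; the function names and the toy windows are hypothetical.</p>
               <p>
```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
POSITIVE = "HKR"  # His, Lys, Arg, as in the legend to Figure 4


def positive_fraction(windows, col):
    """Fraction of windows with a positively charged residue at col."""
    return sum(1 for w in windows if w[col] in POSITIVE) / len(windows)


def column_frequencies(windows, col):
    """Relative frequency of each of the 20 amino acids at col."""
    counts = Counter(w[col] for w in windows)
    total = sum(counts[a] for a in AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx * vy == 0:
        return 0.0  # assumption: report 0 when one list is constant
    return cov / sqrt(vx * vy)


def frequency_property_r(windows, col, scale):
    """R between amino acid frequencies at col and property values."""
    freqs = column_frequencies(windows, col)
    props = [scale[a] for a in AMINO_ACIDS]
    return pearson_r(freqs, props)


if __name__ == "__main__":
    # Hypothetical toy windows; column 0 stands for one motif position.
    toy_windows = ["RRS", "KAS", "AAS", "GRS"]
    print(positive_fraction(toy_windows, 0))
```
               </p>
               <p>The observed fraction can then be compared with the expected database occurrence of 13.6% quoted in the legend to Figure 4.</p>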
            </sec>
            <sec>
               <st>
                  <p>Polarity and flexibility in the C-terminal flank</p>
               </st>
               <p>The residue at position +1 lies in the vicinity of a hydrophobic pocket that is formed by the side chains of Leu198, Pro202 and Leu205 (Figure <figr fid="F5">5</figr>). As a consequence, a large hydrophobic residue was found to be favored at this substrate position <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. A value of R = 0.78 for NAKH900109 (amino acid composition of membrane proteins <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>) confirms the detected tendency for hydrophobic, apolar residues. Also, analysis of mean value deviations from the expected database average indicates a preference for amino acids that occur more frequently in &#946;-strands. Properties such as GEIM800105 (&#946;-strand indices <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>) or KANM800104 (average probability for inner &#946;-sheet <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>) produce significant t-values of 2.71 (99.2%) and 2.59 (98.9%), respectively. These secondary structure scales typically have elevated property values for aliphatic and aromatic amino acids.</p>
               <fig id="F5">
                  <title>
                     <p>Figure 5</p>
                  </title>
                  <caption>
                     <p>Structure of the inhibitor peptide PKI bound to the PKA enzyme: C-terminal region of the substrate</p>
                  </caption>
                  <text>
                     <p><b>Structure of the inhibitor peptide PKI bound to the PKA enzyme: C-terminal region of the substrate</b>. Overall (left) and detail views (right) of the substrate region that lies on the C-terminal side of the phosphorylated serine in complex with the kinase PKA (RCSB Protein Data Bank entry 1JLU [92]) are shown. Ile+1, His+2 and Asp+3 of the PKI substrate as well as the surface of the PKA enzyme to the left are colored according to residue types: white/gray for apolar, green for polar, blue for basic, and red for acidic amino acids. Compared with Figure 3, the orientation of the complex roughly corresponds to a counterclockwise rotation of 90 degrees around the vertical axis. The detail view to the right shows the hydrophobic patch at the surface of PKA which interacts with the substrate residue that lies C-terminally adjacent to the phosphorylated site. The pictures were generated using VMD [93].</p>
                  </text>
                  <graphic file="1745-6150-2-1-5"/>
               </fig>
               <p>Interestingly, correlation effects can be detected between positions +1 and +4, as indicated by an F-value of 1.38 (96.6%) for the property GEIM800107 (&#946;-strand indices for &#946;-proteins <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>). Few data on the role of residue +4 are available in the literature, as this position is missing in the currently resolved 3D structures. It has no clear amino acid preferences, although it is preferentially less polar than the clearly hydrophilic surrounding positions (data not shown). Its spatial location in the vicinity of the hydrophobic patch (Figure <figr fid="F5">5</figr>), combined with the correlations with residue +1, could suggest that positions +1 and +4 both interact with the apolar surface loop of PKA. However, alternative conformations which involve an apolar residue at position +3 also appear possible.</p>
               <p>The intermediary positions +2 and +3 can be characterized by a preference for small residues. Numerous size-related scales such as FASG760101 <abbrgrp><abbr bid="B36">36</abbr></abbrgrp> (R of -0.62 and -0.63 for positions +2 and +3, respectively) produce significant correlation coefficients. To some extent, position +3 also seems to favor flexible amino acids, as indicated by a correlation coefficient R of 0.65 for VINM940102 <abbrgrp><abbr bid="B30">30</abbr></abbrgrp>. The respective substrate positions indeed lie in a spatially constrained region at the mouth of the binding cavity (Figure <figr fid="F5">5</figr>), which explains the appearance of size restrictions.</p>
            </sec>
            <sec>
               <st>
                  <p>Phylogenetic variation of the substrate binding site of PKA</p>
               </st>
               <p>When collecting the learning set substrate proteins, we found 50% human and 89% mammalian example sites. The remaining 11% were from other metazoan species, yeasts and plants (see Materials and Methods for details). We wished to estimate to what extent substrates and enzymes from various organisms are exchangeable with respect to PKA-dependent phosphorylation. In Figure <figr fid="F6">6</figr>, we show the alignment of the sequences of the catalytic subunit of PKA in a large variety of organisms ranging from yeast to human. Positions that are critically important for binding the substrate protein stretch are marked with triangles. Not only are these positions 100% conserved among all sequences shown, but even their sequence environment is almost unchanged among taxa. Therefore, we suggest that substrates for the human PKA are most likely also substrates for PKA of other taxa, and that a predictor for recognizing human substrates can also be used for finding PKA substrates in other eukaryotic organisms.</p>
               <fig id="F6">
                  <title>
                     <p>Figure 6</p>
                  </title>
                  <caption>
                     <p>Multiple alignment of the binding site regions across PKA orthologue sequences</p>
                  </caption>
                  <text>
                     <p><b>Multiple alignment of the binding site regions across PKA orthologue sequences</b>. Starting with the mouse sequence (accession NP_032880) of the protein in the crystal structure 1JLU [92], we searched for orthologues of the catalytic subunit of PKA with the ANNOTATOR suite [45]. In the alignment (generated with T-COFFEE [94]), we present 40 variants thereof ranging as far as from yeast to human (sequence position numbering is without leading methionines according to the 1.29 &#197; rule [56,57]). The figure focuses on the protein polypeptide stretch that encompasses the residues forming the surface of the binding site at substrate position from -3 to +1. Red triangles (at Glu127, Glu170 and Glu230 in the numbering of 1JLU without the leading methionine in NP_032880) mark positions that form the pocket for substrate residues -3 and -2. Blue triangles (at Leu198, Pro202 and Leu205) mark the hydrophobic pocket-forming positions that accept substrate residue +1 [4&#8211;7].</p>
                  </text>
                  <graphic file="1745-6150-2-1-6"/>
               </fig>
            </sec>
            <sec>
               <st>
                  <p>Predictor description and the self-consistency test</p>
               </st>
               <p>The motif structure that was presented in the preceding sections served as a basis for the generation of a prediction tool. The final version of the predictor, called "pkaPS", uses one profile over 13 sequence positions and 14 physico-chemical property terms. In the self-consistency test, the pkaPS predictor generates scores S &#8805; 0 for 236 out of 239 (98.7%) positive examples from the learning set, and, thus, correctly predicts these sequences as potential substrates for PKA-dependent phosphorylation.</p>
               <p>The three entries that are not predicted are summarized in Table <tblr tid="T1">1</tblr>. Although all of them produce profile scores S<sub>profile </sub>above zero, the three database sites (i) Ser10 from the rat brain myelin basic protein <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>, (ii) Ser356 from the rat liver fructose-1,6-bisphosphatase <abbrgrp><abbr bid="B38">38</abbr></abbrgrp> and (iii) Ser197 from human cyclin D1 <abbrgrp><abbr bid="B39">39</abbr></abbrgrp> obviously differ from the consensus represented by the scoring function to a considerable extent. Among other unmet requirements, the charge pattern on the N-terminal side of the reported phosphorylation sites is deviant. Typically, positively charged residues are observed and negative charges are absent. Actually, none of these sites harbors an arginine at either of the important positions -3 or -2. Therefore, and given the current knowledge on substrate binding, it is difficult to imagine how these targets fit into the binding site of PKA. Since experimental protocols for determining phosphorylation sites are non-trivial and the reports are of considerable age, an experimental re-examination of these cases would be advisable.</p>
               <tbl id="T1">
                  <title>
                     <p>Table 1</p>
                  </title>
                  <caption>
                     <p>Results of the self-consistency test</p>
                  </caption>
                  <tblbdy cols="7">
                     <r>
                        <c ca="center">
                           <p>Score</p>
                        </c>
                        <c ca="center">
                           <p>Profile</p>
                        </c>
                        <c ca="center">
                           <p>
                              <it>T</it>
                              <sub>
                                 <it>j</it>
                              </sub>
                           </p>
                        </c>
                        <c ca="center">
                           <p>Access.</p>
                        </c>
                        <c ca="center">
                           <p>Site</p>
                        </c>
                        <c ca="center">
                           <p>Sequence</p>
                        </c>
                        <c ca="center">
                           <p>Reference</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="7">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>-0.381</p>
                        </c>
                        <c ca="center">
                           <p>1.114</p>
                        </c>
                        <c ca="center">
                           <p><it>T</it><sub><it>1</it></sub>,<it>T</it><sub><it>2</it></sub></p>
                        </c>
                        <c ca="center">
                           <p>P02687</p>
                        </c>
                        <c ca="center">
                           <p>10</p>
                        </c>
                        <c ca="center">
                           <p>---------AAQKRPSQR<ul>S</ul>KYLASASTMDHARHGFLPRHRDT</p>
                        </c>
                        <c ca="center">
                           <p>Kishimoto <it>et al</it>. 1985 [37]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>-1.338</p>
                        </c>
                        <c ca="center">
                           <p>0.495</p>
                        </c>
                        <c ca="center">
                           <p><it>T</it><sub><it>1</it></sub>-<it>T</it><sub><it>4</it></sub></p>
                        </c>
                        <c ca="center">
                           <p>P19112</p>
                        </c>
                        <c ca="center">
                           <p>356</p>
                        </c>
                        <c ca="center">
                           <p>SRPSLPLPQSRARESPVH<ul>S</ul>ICDELF-----------------</p>
                        </c>
                        <c ca="center">
                           <p>Ekdahl 1987 [38]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>-1.368</p>
                        </c>
                        <c ca="center">
                           <p>0.069</p>
                        </c>
                        <c ca="center">
                           <p>many</p>
                        </c>
                        <c ca="center">
                           <p>P24385</p>
                        </c>
                        <c ca="center">
                           <p>197</p>
                        </c>
                        <c ca="center">
                           <p>RKHAQTFVALCATDVKFI<ul>S</ul>NPPSMVAAGSVVAAVQGLNLRSP</p>
                        </c>
                        <c ca="center">
                           <p>Sewing &amp; M&#252;ller 1994 [39]</p>
                        </c>
                     </r>
                  </tblbdy>
                  <tblfn>
                     <p>Phosphorylated sites from the learning set that are not predicted by the normally parameterized pkaPS predictor (false-negatives in the self-consistency test). The listed penalties <it>T</it><sub><it>j </it></sub>are the terms which make the highest contributions to the negative overall scores.</p>
                  </tblfn>
               </tbl>
               <p>The expected rate of false-positive predictions can directly be estimated using the set of 1026 negative examples. For a given serine or threonine residue of a query sequence, the probability of true-negative prediction lies at 93.5% (F<sub>p</sub>-rate of 6.5%). This set was used to generate an empirical score distribution of negative examples. In order to obtain a value for the false-positive rate for any generated score S, an analytical approximation of this score distribution was determined (Materials and Methods).</p>
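               <p>As a hedged illustration of how an empirical score distribution of negative examples translates into a false-positive rate for a given score, the following Python sketch counts the fraction of negative-example scores at or above a threshold. The function name and the toy scores are hypothetical and serve only as an example; the analytical approximation actually used by pkaPS is the one described in Materials and Methods and is not reproduced here.</p>
               <p>
```python
from bisect import bisect_left
from typing import Sequence


def empirical_fp_rate(negative_scores: Sequence[float], threshold: float) -> float:
    """Fraction of negative-example scores that are >= threshold."""
    if not negative_scores:
        raise ValueError("need at least one negative-example score")
    ordered = sorted(negative_scores)
    # bisect_left returns the number of scores strictly below the threshold.
    below = bisect_left(ordered, threshold)
    return (len(ordered) - below) / len(ordered)


# Toy scores, invented for illustration only (not data from the paper).
toy_negatives = [-3.2, -2.5, -1.9, -1.1, -0.7, -0.4, -0.2, 0.1, -2.8, -1.5]
print(empirical_fp_rate(toy_negatives, 0.0))
```
               </p>
               <p>With a large negative set, such an empirical estimate becomes coarse in the sparsely populated high-score tail, which is one plausible motivation for fitting an analytical approximation instead.</p>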
            </sec>
            <sec>
               <st>
                  <p>Neighbor-jackknife test</p>
               </st>
               <p>Thorough cross-validation tests are needed in order to assess whether the score function is stably parameterized by the learning set. The pkaPS tool was subjected to a strict cross-validation test in which the query sequence, together with all sequences that share more than 30% identical amino acids with the query, was excluded from the parameterization procedure (neighbor-jackknife test, Materials and Methods).</p>
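               <p>The exclusion rule of the neighbor-jackknife test can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the pkaPS implementation: the identity measure <it>fraction_identical</it> (position-wise comparison of pre-aligned windows of equal length) and the callables <it>train</it> and <it>score</it> are hypothetical placeholders, and only the 30% cut-off is taken from the text.</p>
               <p>
```python
from typing import Any, Callable, List, Sequence


def fraction_identical(a: str, b: str) -> float:
    """Fraction of identical residues in two pre-aligned windows of equal length.

    Gap characters ('-') never count as matches. This simple measure is an
    assumption made for the sketch.
    """
    if len(a) != len(b) or not a:
        raise ValueError("windows must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def neighbor_jackknife(
    examples: Sequence[str],
    train: Callable[[List[str]], Any],
    score: Callable[[Any, str], float],
    identity: Callable[[str, str], float] = fraction_identical,
    max_identity: float = 0.30,
) -> List[float]:
    """Score every example with a model trained without it and its neighbors."""
    scores = []
    for i, query in enumerate(examples):
        # Keep only entries that are neither the query itself nor more than
        # max_identity identical to it.
        training = [
            e for j, e in enumerate(examples)
            if j != i and not identity(query, e) > max_identity
        ]
        scores.append(score(train(training), query))
    return scores
```
               </p>
               <p>Compared with a plain leave-one-out test, the additional removal of close relatives prevents a query from being recognized merely because a near-duplicate remained in the training data.</p>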
               <p>As summarized in Table <tblr tid="T2">2</tblr>, 10 out of the 239 (4.2%) sites from the learning set were not predicted by pkaPS. As expected, the entries that were not predicted in the self-consistency test were not recognized in the cross-validation test either. In this test, two entries (Q13002, position 697 and P24385, position 197) had profile scores below zero. Therefore, we think that the learning set is still somewhat too small to determine the profile stably.</p>
               <tbl id="T2">
                  <title>
                     <p>Table 2</p>
                  </title>
                  <caption>
                     <p>Results of the neighbor-jackknife test</p>
                  </caption>
                  <tblbdy cols="7">
                     <r>
                        <c ca="center">
                           <p>Score</p>
                        </c>
                        <c ca="center">
                           <p>Profile</p>
                        </c>
                        <c ca="center">
                           <p><it>T</it><sub><it>j</it></sub></p>
                        </c>
                        <c ca="center">
                           <p>Access.</p>
                        </c>
                        <c ca="center">
                           <p>Site</p>
                        </c>
                        <c ca="center">
                           <p>Sequence</p>
                        </c>
                        <c ca="center">
                           <p>Reference</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="7">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>-0.108</p>
                        </c>
                        <c ca="center">
                           <p>0.268</p>
                        </c>
                        <c ca="center">
                           <p><it>T</it><sub><it>7</it></sub>,<it>T</it><sub><it>8</it></sub></p>
                        </c>
                        <c ca="center">
                           <p>P02687</p>
                        </c>
                        <c ca="center">
                           <p>33</p>
                        </c>
                        <c ca="center">
                           <p>SASTMDHARHGFLPRHRD<ul>T</ul>GILDSLGRFFGSDRGAPKRGSGK</p>
                        </c>
                        <c ca="center">
                           <p>Kishimoto <it>et al</it>. 1985 [37]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>-0.158</p>
                        </c>
                        <c ca="center">
                           <p>0.733</p>
                        </c>
                        <c ca="center">
                           <p><it>T</it><sub><it>2</it></sub>,<it>T</it><sub><it>4</it></sub></p>
                        </c>
                        <c ca="center">
                           <p>P12336</p>
                        </c>
                        <c ca="center">
                           <p>489</p>
                        </c>
                        <c ca="center">
                           <p>VLVFTLFTFFKVPETKGK<ul>S</ul>FDEIAAEFRKKSGSAPPRKATVQ</p>
                        </c>
                        <c ca="center">
                           <p>Thorens <it>et al</it>. 1996 [61]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>-0.174</p>
                        </c>
                        <c ca="center">
                           <p>0.560</p>
                        </c>
                        <c ca="center">
                           <p><it>T</it><sub><it>4</it></sub>,<it>T</it><sub><it>7</it></sub></p>
                        </c>
                        <c ca="center">
                           <p>P24155</p>
                        </c>
                        <c ca="center">
                           <p>643</p>
                        </c>
                        <c ca="center">
                           <p>RFKQEGVLSPKVGMDYRT<ul>S</ul>ILRPGGSEDASTMLKQFLGRDPK</p>
                        </c>
                        <c ca="center">
                           <p>Tullai <it>et al</it>. 2000 [69]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>-0.264</p>
                        </c>
                        <c ca="center">
                           <p>0.145</p>
                        </c>
                        <c ca="center">
                           <p>
                              <it>T</it>
                              <sub>
                                 <it>2</it>
                              </sub>
                           </p>
                        </c>
                        <c ca="center">
                           <p>P02643</p>
                        </c>
                        <c ca="center">
                           <p>19</p>
                        </c>
                        <c ca="center">
                           <p>GDEEKRNRAITARRQHLK<ul>S</ul>VMLQIAATELEKEEGRREAEKQN</p>
                        </c>
                        <c ca="center">
                           <p>Huang <it>et al</it>. 1974 [82]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>-0.367</p>
                        </c>
                        <c ca="center">
                           <p>0.113</p>
                        </c>
                        <c ca="center">
                           <p><it>T</it><sub><it>2</it></sub>,<it>T</it><sub><it>10</it></sub></p>
                        </c>
                        <c ca="center">
                           <p>Q07954</p>
                        </c>
                        <c ca="center">
                           <p>4517</p>
                        </c>
                        <c ca="center">
                           <p>PTNFTNPVYATLYMGGHG<ul>S</ul>RHSLASTDEKRELLGRGPEDEIG</p>
                        </c>
                        <c ca="center">
                           <p>Li <it>et al</it>. 2001 [83]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>-0.469</p>
                        </c>
                        <c ca="center">
                           <p>0.564</p>
                        </c>
                        <c ca="center">
                           <p><it>T</it><sub><it>4</it></sub>,<it>T</it><sub><it>12</it></sub></p>
                        </c>
                        <c ca="center">
                           <p>P25961</p>
                        </c>
                        <c ca="center">
                           <p>473</p>
                        </c>
                        <c ca="center">
                           <p>VAIIYCFCNGEVQAEIRK<ul>S</ul>WSRWTLALDFKRKARSGSSSYSY</p>
                        </c>
                        <c ca="center">
                           <p>Blind <it>et al</it>. 1996 [62]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>-0.469</p>
                        </c>
                        <c ca="center">
                           <p>-0.028</p>
                        </c>
                        <c ca="center">
                           <p><it>T</it><sub><it>2</it></sub>,<it>T</it><sub><it>12</it></sub></p>
                        </c>
                        <c ca="center">
                           <p>Q13002</p>
                        </c>
                        <c ca="center">
                           <p>697</p>
                        </c>
                        <c ca="center">
                           <p>KIEYGAVEDGATMTFFKK<ul>S</ul>KISTYDKMWAFMSSRRQSVLVKS</p>
                        </c>
                        <c ca="center">
                           <p>Wang <it>et al</it>. 1993 [84]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>-0.619</p>
                        </c>
                        <c ca="center">
                           <p>0.891</p>
                        </c>
                        <c ca="center">
                           <p><it>T</it><sub><it>1</it></sub>,<it>T</it><sub><it>2</it></sub></p>
                        </c>
                        <c ca="center">
                           <p>P02687</p>
                        </c>
                        <c ca="center">
                           <p>10</p>
                        </c>
                        <c ca="center">
                           <p>---------AAQKRPSQR<ul>S</ul>KYLASASTMDHARHGFLPRHRDT</p>
                        </c>
                        <c ca="center">
                           <p>Kishimoto <it>et al</it>. 1985 [37]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>-1.813</p>
                        </c>
                        <c ca="center">
                           <p>0.038</p>
                        </c>
                        <c ca="center">
                           <p><it>T</it><sub><it>1</it></sub>-<it>T</it><sub><it>4</it></sub></p>
                        </c>
                        <c ca="center">
                           <p>P19112</p>
                        </c>
                        <c ca="center">
                           <p>356</p>
                        </c>
                        <c ca="center">
                           <p>SRPSLPLPQSRARESPVH<ul>S</ul>ICDELF-----------------</p>
                        </c>
                        <c ca="center">
                           <p>Ekdahl 1987 [38]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>-1.867</p>
                        </c>
                        <c ca="center">
                           <p>-0.409</p>
                        </c>
                        <c ca="center">
                           <p>many</p>
                        </c>
                        <c ca="center">
                           <p>P24385</p>
                        </c>
                        <c ca="center">
                           <p>197</p>
                        </c>
                        <c ca="center">
                           <p>RKHAQTFVALCATDVKFI<ul>S</ul>NPPSMVAAGSVVAAVQGLNLRSP</p>
                        </c>
                        <c ca="center">
                           <p>Sewing &amp; M&#252;ller 1994 [39]</p>
                        </c>
                     </r>
                  </tblbdy>
                  <tblfn>
                     <p>Phosphorylated sites from the learning set that are not predicted by pkaPS in the neighbor-jackknife test. The three entries with the lowest scores are not predicted in the self-consistency test either (Table 1). The listed penalties <it>T</it><sub><it>j </it></sub>are the terms which make the highest contributions to the negative overall scores.</p>
                  </tblfn>
               </tbl>
               <p>All seven entries that were predicted in the self-consistency test but not in the neighbor-jackknife test have only marginally negative scores between zero and -0.5. Only the three entries that had scores below zero in the self-consistency test also had a score S &lt; -0.5 in the neighbor-jackknife test (see Tables <tblr tid="T1">1</tblr> and <tblr tid="T2">2</tblr>). We think that the score interval between 0.0 and -0.5 represents a twilight zone.</p>
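               <p>The score intervals discussed above can be illustrated with a short Python sketch that sorts the overall scores listed in Table 2 into three classes. The class labels and the treatment of the boundary value -0.5 (counted here as twilight zone) are assumptions made for the example; the text only states that the interval between 0.0 and -0.5 is regarded as a twilight zone.</p>
               <p>
```python
from collections import Counter

# Overall scores S of the ten entries listed in Table 2.
TABLE2_SCORES = [-0.108, -0.158, -0.174, -0.264, -0.367,
                 -0.469, -0.469, -0.619, -1.813, -1.867]


def classify(score: float, twilight_floor: float = -0.5) -> str:
    """Return 'predicted', 'twilight' or 'not predicted' for a score S."""
    if score >= 0.0:
        return "predicted"
    if score >= twilight_floor:
        return "twilight"
    return "not predicted"


counts = Counter(classify(s) for s in TABLE2_SCORES)
print(counts["twilight"], counts["not predicted"])
```
               </p>
               <p>For the Table 2 entries, this yields seven twilight-zone sites and three sites below the twilight zone, in line with the counts given in the text.</p>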
            </sec>
            <sec>
               <st>
                  <p>Summary of the prediction performance and comparison to other tools</p>
               </st>
               <p>To conclude, the pkaPS tool achieves a high prediction performance. Its sensitivity is at least 95.8% as estimated from the neighbor-jackknife test and reaches 98.7% in the self-consistency test. At the same time, a specificity of 93.5% is achieved.</p>
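               <p>The quoted performance values follow directly from the counts reported in the preceding sections, as the following Python sketch shows. The helper name <it>percent</it> and the constant names are hypothetical; the counts (239 positive examples, 3 and 10 missed sites) and the false-positive rate of 6.5% are taken from the text.</p>
               <p>
```python
def percent(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to one decimal place."""
    if total == 0:
        raise ValueError("total must be non-zero")
    return round(100.0 * part / total, 1)


POSITIVES = 239          # positive examples in the learning set
MISSED_SELF = 3          # not predicted in the self-consistency test
MISSED_JACKKNIFE = 10    # not predicted in the neighbor-jackknife test
FP_RATE_PERCENT = 6.5    # false-positive rate on the negative examples

sn_self = percent(POSITIVES - MISSED_SELF, POSITIVES)
sn_jackknife = percent(POSITIVES - MISSED_JACKKNIFE, POSITIVES)
sp = round(100.0 - FP_RATE_PERCENT, 1)

print(sn_self, sn_jackknife, sp)
```
               </p>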
               <p>The pkaPS tool was compared to a set of currently available phosphorylation predictors (Table <tblr tid="T3">3</tblr>). Unfortunately, a comparison of these tools on the same test set was impossible due to the lack of publicly available untrained versions of the tools that could be used for cross-validation tests. Hence, the comparisons were based on sensitivity and specificity values which were taken from the original papers. The performance values from older publications are not directly comparable with those in this work. In general, prediction methods can be expected to show decreased performance when tested on enlarged, more recent and diverse sequence sets. More importantly, the accuracy measured for a prediction tool is also influenced by the rigor of the cross-validation test. This type of test should determine how predictors perform on query sequences that are dissimilar to the learning set examples. In our work, a strict method has been applied, the neighbor-jackknife test. In the leave-one-out procedure, we excluded not only the entry under consideration but also all examples with similar sequences.</p>
               <tbl id="T3">
                  <title>
                     <p>Table 3</p>
                  </title>
                  <caption>
                     <p>Prediction performances of available algorithms compared to pkaPS.</p>
                  </caption>
                  <tblbdy cols="4">
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c cspan="2" ca="center">
                           <p>Prediction performance</p>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c>
                           <p/>
                        </c>
                        <c cspan="2">
                           <hr/>
                        </c>
                        <c>
                           <p/>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>Algorithm</p>
                        </c>
                        <c ca="center">
                           <p><it>S</it><sub><it>n </it></sub>[%]</p>
                        </c>
                        <c ca="center">
                           <p><it>S</it><sub><it>p </it></sub>[%]</p>
                        </c>
                        <c ca="center">
                           <p>Reference</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="4">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>DISPHOS</p>
                        </c>
                        <c ca="center">
                           <p>ca. 76</p>
                        </c>
                        <c ca="center">
                           <p>ca. 85</p>
                        </c>
                        <c ca="center">
                           <p>Iakoucheva <it>et al</it>. 2004 [27]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>SCANSITE</p>
                        </c>
                        <c ca="center">
                           <p>70.7</p>
                        </c>
                        <c ca="center">
                           <p>92.9</p>
                        </c>
                        <c ca="center">
                           <p>Zhou <it>et al</it>. 2004 [15]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>NETPHOSK</p>
                        </c>
                        <c ca="center">
                           <p>79</p>
                        </c>
                        <c ca="center">
                           <p>89</p>
                        </c>
                        <c ca="center">
                           <p>Blom <it>et al</it>. 2004 [12]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>GPS</p>
                        </c>
                        <c ca="center">
                           <p>88.9</p>
                        </c>
                        <c ca="center">
                           <p>90.6</p>
                        </c>
                        <c ca="center">
                           <p>Xue <it>et al</it>. 2005 [16]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>PREDPHOSPHO</p>
                        </c>
                        <c ca="center">
                           <p>88.3</p>
                        </c>
                        <c ca="center">
                           <p>91.1</p>
                        </c>
                        <c ca="center">
                           <p>Kim <it>et al</it>. 2004 [14]</p>
                        </c>
                     </r>
                     <r>
                        <c ca="left">
                           <p>pkaPS</p>
                        </c>
                        <c ca="center">
                           <p>95.8</p>
                        </c>
                        <c ca="center">
                           <p>93.5</p>
                        </c>
                        <c ca="center">
                           <p>this work</p>
                        </c>
                     </r>
                  </tblbdy>
                  <tblfn>
                     <p>The table shows the prediction performances of five other programs compared to the worst performance (<it>S</it><sub><it>n </it></sub>from the neighbor-jackknife test) of the pkaPS predictor. All listed values except for those from DISPHOS refer to PKA-specific versions of the prediction tools. The pkaPS program outperforms all currently available methods that can be used to detect PKA-dependent phosphorylation. The sensitivities (<it>S</it><sub><it>n</it></sub>) and specificities (<it>S</it><sub><it>p</it></sub>) were directly taken from the original papers. Iakoucheva <it>et al</it>. [27] provide two possible values for the specificity as performance measure for the DISPHOS predictor, one that takes into account the possible occurrence of noise in the negative learning set (higher <it>S</it><sub><it>p</it></sub>) and one that does not (lower <it>S</it><sub><it>p</it></sub>). For DISPHOS, the <it>S</it><sub><it>n </it></sub>value of 76% for serine (none is mentioned for threonine) and the higher, estimated specificity value of 85% (for serine) were used. The prediction performance for the SCANSITE program was not taken from the corresponding publication (Obenauer <it>et al</it>. 2003 [13]) but from Table 2 in the GPS paper [15]. The reason is that no evaluation of the SCANSITE performance in detecting sites for phosphorylation by PKA could be found in the paper by Obenauer <it>et al</it>. 2003 [13]. Here, the values for the low-stringency cut-off were taken.</p>
                  </tblfn>
               </tbl>
               <p>As judged from the published predictor performance ratings, pkaPS provides better specificity and sensitivity than all other currently available tools. Among these methods, only DISPHOS <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> is a predictor for "average phosphorylation" sites without considering kinase specificity. All other tools have implementations for specific kinases including PKA. The algorithms that come closest to the performance of pkaPS are PREDPHOSPHO, which uses a support-vector-machine-based implementation <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, and GPS, which rests upon a group-based scoring method <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>.</p>
            </sec>
            <sec>
               <st>
                  <p>Prediction of PKA targets within the human proteome</p>
               </st>
               <p>In addition to thorough cross-validation tests, the performance of the pkaPS tool was studied by analyzing predicted PKA-dependent phosphorylation sites in the human proteome. The human protein sequences were retrieved from the NCBI FTP-site (40877 sequences, September 14<sup>th </sup>2006 at <abbrgrp><abbr bid="B40">40</abbr></abbrgrp>). From a total of 2,485,866 serines and threonines, 258,271 (10.4%) were predicted as putative phosphorylation sites with scores <it>S </it>> 0. In our understanding, the list of predicted sites contains (a) true PKA phosphorylation sites, (b) sites that are phosphorylated <it>in vitro </it>by PKA but not <it>in vivo </it>due to the absence of biological context (see the comment on hidden signals in the discussion and in ref. <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>) and (c) genuine false-positive predictions of serines/threonines that lie in sequence stretches incapable of productive interaction with the catalytic site of PKA. The consideration of additional functional sequence regions has a significant impact on the rate of predicted PKA-dependent phosphorylation sites. For example, there are 649979 ST-sites in 10195 proteins with predicted signal peptides (with any of the taxonomic versions of SIGNALP 3.0 <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>). For these proteins, pkaPS generates hits for 56970 sites (8.8%), a considerably lower value than that for the full proteome.</p>
               <p>Proteins with many serines/threonines are likely to have multiple PKA-dependent phosphorylation site predictions. For the human proteome, we find that the more sites are predicted per protein, the smaller the mean distance between them (Figure <figr fid="F7">7</figr>). This confirms the trend observed in Figure <figr fid="F2">2</figr> that proteins with many phosphorylation sites tend to pack these closely together into unified serine/threonine-rich regions.</p>
               <fig id="F7">
                  <title>
                     <p>Figure 7</p>
                  </title>
                  <caption>
                     <p>Mean distances between pairs of neighboring predicted sites depending on the total number of predicted sites in the query proteins</p>
                  </caption>
                  <text>
                     <p><b>Mean distances between pairs of neighboring predicted sites depending on the total number of predicted sites in the query proteins</b>. The red line displays the linear regression (y = 59.3 - 0.271x; R = -0.66) calculated using these data points.</p>
                  </text>
                  <graphic file="1745-6150-2-1-7"/>
               </fig>
               <p>The probability of wrongly predicting a site within a generally non-phosphorylated protein appears to dramatically increase with the number of S/T sites in its sequence. Among the 40877 sequences that are included in the retrieved file, only 4860 entries (11.9%) do not have a single predicted site for PKA-dependent phosphorylation. This result strongly emphasizes the main difficulty of predicting post-translational modifications that can occur in a query protein with multiple suitable serines/threonines. In such cases, a single false-positive site may be responsible for an incorrect assignment of the entire protein.</p>
               <p>An increase in prediction accuracy may be obtained by grouping predicted entries into clusters of related sequences. Naturally, predictions for a single post-translational modification can be considered more reliable if they are frequently observed in a protein family as opposed to a lone protein sequence <abbrgrp><abbr bid="B43">43</abbr></abbrgrp>, although this trend is not absolute even for closely related homologues <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B44">44</abbr></abbrgrp>. To test the pkaPS tool on families of homologous sequences, the human proteome was clustered into 14674 groups with the MCL algorithm <abbrgrp><abbr bid="B45">45</abbr></abbrgrp> (as implemented within the ANNOTATOR suite <abbrgrp><abbr bid="B46">46</abbr></abbrgrp>) and each group was analyzed with the predictor separately. Many of these clusters contained only a single sequence or a few sequences. All groups with fewer than 20 entries were removed from further consideration. The remaining 182 clusters were sorted according to the ratio between predicted serines/threonines and the total number of these residues in the cluster. The 20 clusters with the highest ratio are listed in Table <tblr tid="T4">4</tblr>. The good performance of the predictor is supported by the fact that families of proteins known to be good PKA targets occupy the top ranks in this listing. In addition to known phosphorylated sequence classes (e.g. histone H2A), there are also entirely uncharacterized groups of proteins that deserve experimental analysis. It is remarkable that this list does not contain any obviously false-positive sequence family.</p>
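               <p>The filtering and ranking step described above can be illustrated with a minimal sketch. It is not the code used for this study: the class name, the function name and the toy clusters are hypothetical, and only the size threshold (20 entries), the ranking criterion (predicted S/T divided by total S/T) and the number of listed clusters (20) are taken from the text.</p>

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical container for per-cluster counts (names are illustrative)."""
    cluster_id: int
    n_entries: int      # number of sequences in the cluster
    predicted_st: int   # S/T residues predicted as PKA sites
    total_st: int       # all S/T residues in the cluster

    @property
    def predicted_ratio(self) -> float:
        # Ratio used for ranking; clusters without S/T residues get 0.0.
        return self.predicted_st / self.total_st if self.total_st else 0.0


def rank_clusters(clusters, min_entries=20, top_n=20):
    """Drop small clusters, then sort the rest by predicted/total S/T ratio."""
    kept = [c for c in clusters if c.n_entries >= min_entries]
    kept.sort(key=lambda c: c.predicted_ratio, reverse=True)
    return kept[:top_n]


# Toy input, invented for illustration only (not data from Table 4).
toy_clusters = [
    Cluster(1, 25, 30, 100),   # ratio 0.30
    Cluster(2, 5, 90, 100),    # removed: only 5 entries
    Cluster(3, 40, 10, 200),   # ratio 0.05
    Cluster(4, 20, 60, 150),   # ratio 0.40
]
ranked = rank_clusters(toy_clusters)
for c in ranked:
    print(c.cluster_id, c.n_entries, round(100 * c.predicted_ratio, 1))
```

               <p>In this sketch, a cluster with a high ratio but too few members (toy cluster 2) never reaches the ranking, which mirrors the removal of small groups before sorting.</p>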
               <tbl id="T4">
                  <title>
                     <p>Table 4</p>
                  </title>
                  <caption>
                     <p>Prediction of the clustered human proteome.</p>
                  </caption>
                  <tblbdy cols="7">
                     <r>
                        <c ca="center">
                           <p>Cluster number</p>
                        </c>
                        <c ca="center">
                           <p>Total entries</p>
                        </c>
                        <c ca="center">
                           <p>Predicted S/T (%)</p>
                        </c>
                        <c ca="center">
                           <p>Predicted S/T</p>
                        </c>
                        <c ca="center">
                           <p>Predicted entries (%)</p>
                        </c>
                        <c ca="center">
                           <p>Predicted entries</p>
                        </c>
                        <c ca="center">
                           <p>Common domains/superfamily</p>
                        </c>
                     </r>
                     <r>
                        <c cspan="7">
                           <hr/>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>171</p>
                        </c>
                        <c ca="center">
                           <p>21</p>
                        </c>
                        <c ca="center">
                           <p>39.7</p>
                        </c>
                        <c ca="center">
                           <p>52</p>
                        </c>
                        <c ca="center">
                           <p>100</p>
                        </c>
                        <c ca="center">
                           <p>21</p>
                        </c>
                        <c ca="center">
                           <p>High mobility group</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>88</p>
                        </c>
                        <c ca="center">
                           <p>32</p>
                        </c>
                        <c ca="center">
                           <p>36.2</p>
                        </c>
                        <c ca="center">
                           <p>141</p>
                        </c>
                        <c ca="center">
                           <p>100</p>
                        </c>
                        <c ca="center">
                           <p>32</p>
                        </c>
                        <c ca="center">
                           <p>Histone H2A</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>43</p>
                        </c>
                        <c ca="center">
                           <p>48</p>
                        </c>
                        <c ca="center">
                           <p>30.8</p>
                        </c>
                        <c ca="center">
                           <p>518</p>
                        </c>
                        <c ca="center">
                           <p>91.7</p>
                        </c>
                        <c ca="center">
                           <p>44</p>
                        </c>
                        <c ca="center">
                           <p>Splicing factor, predicted RNA-binding</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>84</p>
                        </c>
                        <c ca="center">
                           <p>34</p>
                        </c>
                        <c ca="center">
                           <p>29</p>
                        </c>
                        <c ca="center">
                           <p>249</p>
                        </c>
                        <c ca="center">
                           <p>100</p>
                        </c>
                        <c ca="center">
                           <p>34</p>
                        </c>
                        <c ca="center">
                           <p>TAFII28-like</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>109</p>
                        </c>
                        <c ca="center">
                           <p>27</p>
                        </c>
                        <c ca="center">
                           <p>27.3</p>
                        </c>
                        <c ca="center">
                           <p>81</p>
                        </c>
                        <c ca="center">
                           <p>100</p>
                        </c>
                        <c ca="center">
                           <p>27</p>
                        </c>
                        <c ca="center">
                           <p>Unknown</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>72</p>
                        </c>
                        <c ca="center">
                           <p>36</p>
                        </c>
                        <c ca="center">
                           <p>26.4</p>
                        </c>
                        <c ca="center">
                           <p>101</p>
                        </c>
                        <c ca="center">
                           <p>94.4</p>
                        </c>
                        <c ca="center">
                           <p>34</p>
                        </c>
                        <c ca="center">
                           <p>GAGE protein</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>131</p>
                        </c>
                        <c ca="center">
                           <p>24</p>
                        </c>
                        <c ca="center">
                           <p>24.7</p>
                        </c>
                        <c ca="center">
                           <p>80</p>
                        </c>
                        <c ca="center">
                           <p>87.5</p>
                        </c>
                        <c ca="center">
                           <p>21</p>
                        </c>
                        <c ca="center">
                           <p>Ribosomal protein L21e</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>123</p>
                        </c>
                        <c ca="center">
                           <p>25</p>
                        </c>
                        <c ca="center">
                           <p>22.8</p>
                        </c>
                        <c ca="center">
                           <p>146</p>
                        </c>
                        <c ca="center">
                           <p>100</p>
                        </c>
                        <c ca="center">
                           <p>25</p>
                        </c>
                        <c ca="center">
                           <p>Ribosomal protein L22</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>172</p>
                        </c>
                        <c ca="center">
                           <p>21</p>
                        </c>
                        <c ca="center">
                           <p>22.5</p>
                        </c>
                        <c ca="center">
                           <p>108</p>
                        </c>
                        <c ca="center">
                           <p>100</p>
                        </c>
                        <c ca="center">
                           <p>21</p>
                        </c>
                        <c ca="center">
                           <p>Histone H2B</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>33</p>
                        </c>
                        <c ca="center">
                           <p>59</p>
                        </c>
                        <c ca="center">
                           <p>21.6</p>
                        </c>
                        <c ca="center">
                           <p>390</p>
                        </c>
                        <c ca="center">
                           <p>98.3</p>
                        </c>
                        <c ca="center">
                           <p>58</p>
                        </c>
                        <c ca="center">
                           <p>Cyclophilin-like</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>94</p>
                        </c>
                        <c ca="center">
                           <p>30</p>
                        </c>
                        <c ca="center">
                           <p>21.1</p>
                        </c>
                        <c ca="center">
                           <p>111</p>
                        </c>
                        <c ca="center">
                           <p>93.3</p>
                        </c>
                        <c ca="center">
                           <p>28</p>
                        </c>
                        <c ca="center">
                           <p>Histones H3/H4</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>105</p>
                        </c>
                        <c ca="center">
                           <p>27</p>
                        </c>
                        <c ca="center">
                           <p>20.9</p>
                        </c>
                        <c ca="center">
                           <p>88</p>
                        </c>
                        <c ca="center">
                           <p>96.3</p>
                        </c>
                        <c ca="center">
                           <p>26</p>
                        </c>
                        <c ca="center">
                           <p>KRAB domain</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>7</p>
                        </c>
                        <c ca="center">
                           <p>136</p>
                        </c>
                        <c ca="center">
                           <p>20</p>
                        </c>
                        <c ca="center">
                           <p>1300</p>
                        </c>
                        <c ca="center">
                           <p>96.3</p>
                        </c>
                        <c ca="center">
                           <p>131</p>
                        </c>
                        <c ca="center">
                           <p>GTPase-activating protein</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>10</p>
                        </c>
                        <c ca="center">
                           <p>123</p>
                        </c>
                        <c ca="center">
                           <p>19.7</p>
                        </c>
                        <c ca="center">
                           <p>412</p>
                        </c>
                        <c ca="center">
                           <p>84.6</p>
                        </c>
                        <c ca="center">
                           <p>104</p>
                        </c>
                        <c ca="center">
                           <p>Unknown</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>115</p>
                        </c>
                        <c ca="center">
                           <p>26</p>
                        </c>
                        <c ca="center">
                           <p>19.6</p>
                        </c>
                        <c ca="center">
                           <p>126</p>
                        </c>
                        <c ca="center">
                           <p>100</p>
                        </c>
                        <c ca="center">
                           <p>26</p>
                        </c>
                        <c ca="center">
                           <p>60S ribosomal protein L6</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>154</p>
                        </c>
                        <c ca="center">
                           <p>22</p>
                        </c>
                        <c ca="center">
                           <p>18.7</p>
                        </c>
                        <c ca="center">
                           <p>94</p>
                        </c>
                        <c ca="center">
                           <p>100</p>
                        </c>
                        <c ca="center">
                           <p>22</p>
                        </c>
                        <c ca="center">
                           <p>HIV-1 Vpr-binding, High mobility group</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>79</p>
                        </c>
                        <c ca="center">
                           <p>35</p>
                        </c>
                        <c ca="center">
                           <p>17.5</p>
                        </c>
                        <c ca="center">
                           <p>206</p>
                        </c>
                        <c ca="center">
                           <p>91.4</p>
                        </c>
                        <c ca="center">
                           <p>32</p>
                        </c>
                        <c ca="center">
                           <p>Ras GTPase-activating protein</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>38</p>
                        </c>
                        <c ca="center">
                           <p>52</p>
                        </c>
                        <c ca="center">
                           <p>17.4</p>
                        </c>
                        <c ca="center">
                           <p>287</p>
                        </c>
                        <c ca="center">
                           <p>92.3</p>
                        </c>
                        <c ca="center">
                           <p>48</p>
                        </c>
                        <c ca="center">
                           <p>Septins</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>153</p>
                        </c>
                        <c ca="center">
                           <p>22</p>
                        </c>
                        <c ca="center">
                           <p>16.3</p>
                        </c>
                        <c ca="center">
                           <p>105</p>
                        </c>
                        <c ca="center">
                           <p>77.3</p>
                        </c>
                        <c ca="center">
                           <p>17</p>
                        </c>
                        <c ca="center">
                           <p>RNA-binding protein TIA-1/TIAR (RRM superfamily)</p>
                        </c>
                     </r>
                     <r>
                        <c ca="center">
                           <p>160</p>
                        </c>
                        <c ca="center">
                           <p>22</p>
                        </c>
                        <c ca="center">
                           <p>16.1</p>
                        </c>
                        <c ca="center">
                           <p>465</p>
                        </c>
                        <c ca="center">
                           <p>100</p>
                        </c>
                        <c ca="center">
                           <p>22</p>
                        </c>
                        <c ca="center">
                           <p>Uncharacterized conserved protein (KOG4791)</p>
                        </c>
                     </r>
                  </tblbdy>
                  <tblfn>
                     <p>The table shows the 20 clusters with the highest ratio between predicted and total S/T-sites. The rightmost column summarizes each cluster by the most common domains found using the CDD [85] and PFAM [86] databases (e-value cutoff of 0.01). Lower cluster numbers indicate clusters with more sequences included.</p>
                  </tblfn>
               </tbl>
               <p>The prediction of phosphorylation sites in ribosomal proteins such as L21e, L22 and L6 deserves special attention in the context of the recent discovery that some ribosomal proteins are phosphorylated by specific kinases (such as the ribosomal protein S6 kinase (S6K)) and of the important biological role of this phosphorylation <abbrgrp><abbr bid="B47">47</abbr><abbr bid="B48">48</abbr></abbrgrp>.</p>
            </sec>
            <sec>
               <st>
                  <p>Description of the associated WWW site</p>
               </st>
               <p>Supplementary data as well as the pkaPS WWW-server are available at the Mendel WWW-site <abbrgrp><abbr bid="B49">49</abbr></abbrgrp>. The pkaPS server currently accepts up to 500 sequences in FASTA format (with no more than 10000 S/T residues). For analyzing larger sets, we recommend contacting the authors. In interpreting the results, we advise considering scores above 0 as good predictions; the twilight zone limit is -0.5. The predictor pkaPS analyzes the capability of the query sequence to productively interact with PKA. Additional information should be gathered from the literature or from predictors for other sequence properties to decide whether the prediction is a hidden signal or makes sense in the biological context of the query. Additionally, we provide (i) access to the learning set, (ii) detailed results of the self-consistency and the neighbor-jackknife tests, and (iii) downloads of the predictions for the human proteome both in plain and MCL-clustered forms <abbrgrp><abbr bid="B45">45</abbr></abbrgrp>.</p>
            </sec>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>Despite considerable algorithmic advances in the field, none of the prediction tools for PKA-dependent phosphorylation previously described in the literature achieves specificity and sensitivity rates both above 90%. In our view, several biological and computational aspects contribute to this situation. Among them are problems (a) with the availability of experimental data, (b) with serine/threonine-rich regions, (c) with the incorporation of the available physico-chemical and biological knowledge into the scoring function used to discriminate productively interacting substrates from non-permissive sequence stretches, (d) with the issue of accessibility of the phosphorylation motif within the protein's three-dimensional structure (intrinsically disordered regions surrounding the site) <abbrgrp><abbr bid="B27">27</abbr><abbr bid="B50">50</abbr></abbrgrp> and (e) with the issue of hidden signals (proteins that would be phosphorylated if they were in contact with PKA but which never have the appropriate biological context during their life cycle) <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>.</p>
         <p>With regard to point (a), little experimental data is available for most kinases on the sequence variability of substrates, the structural detail of kinase-substrate complexes, or the kinetic and energetic aspects of the interaction. Only a few kinases including PKA are reasonably well studied in this respect. For example, the number of sequentially dissimilar substrate sequences required for reliable parameterization of the profile term was estimated to be at least 200 in <abbrgrp><abbr bid="B50">50</abbr></abbrgrp>. This number is reached for PKA even in the neighbor jackknife test (among the 239 sequence examples, the number of excluded sequences has never been above 10) and the results of this test show that stable profile parameterization has almost been achieved.</p>
         <p>The issue of many serines/threonines in the sequence (b) is especially challenging since ST-rich regions are common in intra- and extracellular proteins. To detect phosphorylated proteins on a large-scale basis, every single potential site in a sequence must be taken into consideration. If <b><it>S</it></b><sub><b><it>p </it></b></sub>(measured as value between 0 and 1) is the rate of correct rejection of a site and if there are <it>n </it>serine/threonine residues in a query sequence, the specificity of the task for classification of query proteins decreases significantly (to <m:math name="1745-6150-2-1-i1" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msubsup><m:mi>S</m:mi><m:mi>p</m:mi><m:mi>n</m:mi></m:msubsup></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacqWFtbWudaqhaaWcbaGae8hCaahabaGae8NBa4gaaaaa@30D6@</m:annotation></m:semantics></m:math> &lt;&lt; 1). Considering the difficulties associated with prediction of potentially multiple sites in ST-rich regions, it is clear that very high accuracies are needed if such algorithms are to be applied routinely on a large-scale proteome basis.</p>
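         <p>The decay of protein-level specificity with the number of candidate residues can be made concrete with a short numerical sketch. It assumes, as the expression above implicitly does, that all <it>n</it> sites are rejected independently and with the same per-site specificity; the function names and the example values are illustrative and are not part of pkaPS.</p>

```python
def protein_level_specificity(site_specificity, n_sites):
    """Probability that none of n_sites non-phosphorylated S/T residues is
    wrongly predicted, assuming independent sites with equal specificity."""
    return site_specificity ** n_sites


def required_site_specificity(target_protein_specificity, n_sites):
    """Per-site specificity needed to reach a target protein-level specificity
    under the same independence assumption (inverse of the function above)."""
    return target_protein_specificity ** (1.0 / n_sites)


# Illustrative values only: three per-site specificities, three protein sizes.
for sp in (0.90, 0.95, 0.99):
    for n in (10, 50, 100):
        print(f"S_p={sp:.2f}  n={n:3d}  S_p^n={protein_level_specificity(sp, n):.4f}")

# Per-site specificity needed so that 90% of non-substrate proteins
# with 50 S/T residues receive no false-positive site at all.
print(f"{required_site_specificity(0.90, 50):.4f}")
```

         <p>Under these assumptions, even a per-site specificity that looks high in a benchmark leaves most S/T-rich non-substrate proteins with at least one false-positive site, which is consistent with the small fraction of human sequences without any predicted site reported above.</p>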
         <p>The incorporation of the heterogeneous knowledge about the PKA-substrate protein relationship into the scoring function (issue c) is a non-trivial problem. Experimental reports usually do not provide the knowledge in the form that is necessary for formulating algorithms. There are two ways to deal with this problem &#8211; either to take the information as is and to hope that machine learning procedures filter the aspects of the data that are relevant for prediction, or to formulate a physically reasonable model of productive binding events with the kinase directly. Machine learning approaches have shown their usability in a variety of applications, especially in cases where lots of uniform data are available. The classical example is SIGNALP, for the derivation of which learning sets in the size of thousands of substrate proteins were collected <abbrgrp><abbr bid="B42">42</abbr></abbrgrp>.</p>
         <p>In many other biological applications, the data situation is far less comfortable. In such circumstances, the usage of machine learning software packages as "black boxes" for autonomous extraction of score function parameters without human intervention and without explicitly considering the physico-chemical and biological realities of the problem under study can become dangerous. In their letter to the editors of the Biophysical Journal in 1994 <abbrgrp><abbr bid="B51">51</abbr></abbrgrp>, Frank Darius and Raul Rojas analyze the difficulties arising from the discrepancy between the very high dimension of the parameter space in modern machine learning approaches and scarce data, using an alternative signal peptide predictor as the example for their criticism. To summarize, the problem is that the calculated parameters are not reliable and it is not clear whether the correlations found are numerical noise in the data or biologically meaningful. If the data are scarce and no one tells the "black box" how the substrate protein interacts with the receptor, then the box would indeed need to be a "magic box" to pick the correct significant parameters.</p>
         <p>As a direct consequence, human involvement and additional biological knowledge are indispensable for dimensionality reduction. In contrast to machine learning approaches, a physically justified model of productive binding with the kinase already provides a reasonable analytical form of scoring function terms. In this context, it is not so important whether this form can be further improved. We wish to emphasize that the number of parameters to be determined with the help of learning data is dramatically reduced.</p>
         <p>In our approach, we consistently follow these considerations and try to incorporate all biologically relevant information into the analytical form of the prediction function. For example, it is essential that as many as possible of the sequence positions that carry relevant information are considered in the prediction procedure. In the case of PKA, we found this region to occupy the segment -18...+23. Also, we have very few parameters: a profile term over 13 positions centered around the putative site and 14 physical property terms <it>T</it><sub><it>j </it></sub>(typically involving 3 parameters: the mean and the standard deviation of an amino acid index averaged over some sequence region as well as a weight factor for the whole term). Especially the latter set of parameters is determined with high significance given the 239 positive examples. We think that the simplicity of our algorithm is its big strength since the output of the decision function clearly indicates what kind of property supports or prevents the prediction of a query as PKA substrate. In the process of predictor development, human intervention can ensure that only the biologically meaningful among the significant correlations enter the decision function.</p>
         <p>The influence of the structure of the protein on the accessibility of the phosphorylation motif to the kinase (issue d) is difficult to estimate at present. Because the physical property terms demand an excess of hydrophilic, small and flexible residues in the region -18...+23, it is quite unlikely that a sequence region hit by our predictor is part of a 3D structure; rather, it is likely to represent an intrinsically disordered segment. Nevertheless, it cannot be excluded that sequence stretches outside the motif definition cause the entire protein to fold in such a way that the potential phosphorylation site is not accessible to its modifying kinase.</p>
         <p>Finally, the cellular context is important (issue e). In the case of some translocation signals, it has been experimentally shown that even the presence of an <it>in vivo </it>functional motif does not mean that the carrying protein is also imported into the corresponding subcellular compartment <abbrgrp><abbr bid="B41">41</abbr></abbrgrp>. This discovery of hidden sequence signals highlights the significance of cellular hierarchies for small functional protein motifs. Hence, current phosphorylation predictors including pkaPS do not predict phosphorylation as such but rather the potential of a sequence stretch to interact productively with the modifying kinase. This distinction may appear negligible, but it must be kept in mind when interpreting any prediction. For example, a targeting signal located far away from a functional phosphorylation site on the same protein may remove the protein from the cellular compartment of the respective kinase, thereby overriding the phosphorylation motif. Thus, the analyzed motif is not the only sequence stretch of the protein that determines whether the modification occurs.</p>
         <p>These considerations mean that a phosphorylation predictor is not the only source of information that must be consulted when evaluating the phosphorylation state of a protein. Moreover, the number of apparently wrong predictions (if only the physiologically relevant cases are counted) of an algorithm is not only determined by the imperfection of its design since the predictor focuses on the query sequence stretch. Hence, even if all permissive amino acid permutations of the substrate motif are known, the theoretical accuracy of any post-translational modification predictor will have an upper limit clearly below 100%.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>The refinement of the PKA phosphorylation motif showed that approximately 20 sequence positions flanking the phosphorylated residue on both sides are restricted in their sequence variability. The conserved physical pattern can be rationalized in terms of a qualitative model of binding to the catalytic cleft of protein kinase A. The pkaPS predictor based on this motif description confidently discriminates PKA phosphorylation sites from serines/threonines with non-permissive sequence environments (sensitivity of ~96% at a specificity of ~94%).</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Learning set construction</p>
            </st>
            <p>UNIPROT <abbrgrp><abbr bid="B52">52</abbr><abbr bid="B53">53</abbr><abbr bid="B54">54</abbr></abbrgrp> accessions and positions of sites that are phosphorylated by PKA were retrieved from the Phospho.ELM database version 4.0 <abbrgrp><abbr bid="B55">55</abbr></abbrgrp>. Subsequently, an alignment was generated from the 81-residue sequences that comprise the phosphorylated residue and the 40 flanking amino acids on each side. Positions beyond the N- or C-terminus were treated as non-occupied (without amino acid) in further calculations. The original sequences of the substrate proteins were obtained from the UNIPROT database <abbrgrp><abbr bid="B56">56</abbr></abbrgrp>. Initiator methionines were removed according to the 1.29&#197; rule <abbrgrp><abbr bid="B57">57</abbr><abbr bid="B58">58</abbr></abbrgrp>.</p>
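            <p>The window extraction described above can be sketched as follows. This is a minimal illustration and not the code used for pkaPS: the function name <it>site_window</it>, the gap symbol "-" for non-occupied positions and the toy sequence are assumptions made here, and initiator methionine removal is not modeled.</p>
            <p><![CDATA[
```python
FLANK = 40  # flanking residues on each side, as stated in the text
GAP = "-"   # assumed placeholder for non-occupied positions (not from the paper)


def site_window(sequence, site, flank=FLANK, gap=GAP):
    """Return the 2*flank+1 symbols centred on the 1-based position `site`."""
    if not 1 <= site <= len(sequence):
        raise ValueError("site lies outside the sequence")
    centre = site - 1  # 0-based index of the phosphorylated residue
    window = []
    for offset in range(-flank, flank + 1):
        index = centre + offset
        # Positions beyond either terminus are filled with the gap symbol.
        window.append(sequence[index] if 0 <= index < len(sequence) else gap)
    return "".join(window)


if __name__ == "__main__":
    toy = "MKRRASLT"  # made-up sequence with a serine at position 6
    print(site_window(toy, 6, flank=3))
```
]]></p>
            <p>With the default flank of 40, every window has 81 symbols and the candidate residue sits at index 40 of the returned string, which corresponds to alignment position 0.</p>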
            <p>Previous analyses of typical annotation errors in databases <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B20">20</abbr><abbr bid="B22">22</abbr><abbr bid="B42">42</abbr></abbrgrp> emphasized the importance of learning set curation. As expected, several entries were inaccurate or lacked clear experimental verification. Therefore, the following modifications were introduced (protein sequences are indicated by UNIPROT accession numbers):</p>
            <p>&#8226; The first phosphorylation site of P02646 is actually a double site that lies at positions 22/23 instead of position 20 <abbrgrp><abbr bid="B59">59</abbr><abbr bid="B60">60</abbr></abbrgrp>.</p>
            <p>&#8226; Position 137 of the Casein-B precursor sequence (P02666) contains arginine, not serine. According to the paper where the experimental verification is reported <abbrgrp><abbr bid="B61">61</abbr></abbrgrp>, phosphorylation occurs in a "variant B" which contains this mutation.</p>
            <p>&#8226; The experimental verification in the paper cited for the entry P11168 was actually performed for the rat (RINm5F cell line), not the human protein <abbrgrp><abbr bid="B62">62</abbr></abbrgrp>. As a consequence, the entry was replaced by the corresponding rat sequence P12336, with the reported phosphorylation sites located at positions 489, 501, 503 and 510.</p>
            <p>&#8226; The exact localization of the PKA-dependent phosphorylation sites in the PTH/PTHrP type I receptor (P25107) was performed using the rat protein, not the one from opossum <abbrgrp><abbr bid="B63">63</abbr></abbrgrp>. Therefore, P25107 was replaced by P25961 (positions 491, 473 and 475).</p>
            <p>&#8226; Phosphorylation of the sites in P00698 appears to occur only in the denatured protein <abbrgrp><abbr bid="B64">64</abbr></abbrgrp>. As the experimental verification status of the entry is not entirely clear, it was excluded from the dataset.</p>
            <p>&#8226; Entry P13280 was removed from the dataset because the experimental verification for this extremely unusual motif is unclear <abbrgrp><abbr bid="B65">65</abbr></abbrgrp>.</p>
            <p>&#8226; Serine 259 from P04049 is phosphorylated <abbrgrp><abbr bid="B66">66</abbr></abbrgrp> but not listed in Phospho.ELM and was, therefore, added subsequently.</p>
            <p>&#8226; According to the corresponding paper <abbrgrp><abbr bid="B67">67</abbr></abbrgrp> and the UNIPROT entry, the phosphorylated residue of P20020 lies at position 1216 and not 1178.</p>
            <p>&#8226; Phosphorylation of &#947;-aminobutyric-acid receptor &#946;1 is originally reported for the mouse protein instead of the human counterpart <abbrgrp><abbr bid="B68">68</abbr></abbrgrp>. As a consequence, entry P18505 was replaced by P50571.</p>
            <p>&#8226; The reported phosphorylation sites for P32245 are only proposed to be potential sites for PKA and GRK. They actually lack any direct experimental verification <abbrgrp><abbr bid="B69">69</abbr></abbrgrp> for PKA-dependent phosphorylation. The corresponding entries were removed from the learning set as a consequence.</p>
            <p>&#8226; Phosphorylation of the metallopeptidase EP24.15 is reported for the rat instead of the human protein <abbrgrp><abbr bid="B70">70</abbr></abbrgrp>. Entry P52888 was, therefore, replaced by P24155 including the correct location of the phosphorylated serine.</p>
            <p>&#8226; The phosphorylated serine in P07101 is located at position 71, not 40. Although the original paper reports Ser40 as the site <abbrgrp><abbr bid="B71">71</abbr></abbrgrp>, the PKA-motif is shifted toward the C-terminus in the UNIPROT entry as a result of additional N-terminal regions from splice variants. Moreover, the experimental verification was performed using the rat protein, and the phosphorylated serine is annotated as "by similarity" in the UNIPROT sequence. Hence, the entry was removed from the learning set.</p>
            <p>&#8226; P01233 can theoretically be phosphorylated at three sites. However, the post-translational modification states depend on whether the implicated &#946;-subunit is free and in its native form <abbrgrp><abbr bid="B72">72</abbr></abbrgrp>. Therefore, the corresponding entry was removed from the learning set.</p>
            <p>To generate a set of negative examples, the references cited for a subset of the learning set sequences were screened. If it could be deduced that the phosphorylated S/T-sites reported in a publication were the only residues modified by PKA, then all remaining S/T-sites of that protein were added to the set of negative examples.</p>
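            <p>The selection rule for negative examples can be written down in a few lines. The sketch below only covers the mechanical step of collecting the remaining serines and threonines; the decision whether a publication establishes the reported sites as the sole PKA targets is a manual literature judgment and is not modeled. The function name <it>negative_sites</it> and the toy sequence are assumptions.</p>
            <p><![CDATA[
```python
def negative_sites(sequence, phosphorylated_positions):
    """Return 1-based positions of S/T residues that are not verified PKA sites."""
    positives = set(phosphorylated_positions)
    for pos in positives:
        if not 1 <= pos <= len(sequence):
            raise ValueError(f"position {pos} lies outside the sequence")
        if sequence[pos - 1] not in "ST":
            raise ValueError(f"position {pos} is not a serine or threonine")
    # Every other serine/threonine of the protein becomes a negative example.
    return [
        i
        for i, residue in enumerate(sequence, start=1)
        if residue in "ST" and i not in positives
    ]


if __name__ == "__main__":
    toy = "MSRRASTKT"  # made-up sequence; position 6 is the verified site
    print(negative_sites(toy, [6]))
```
]]></p>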
            <p>The final learning sets consist of 143 sequences with 239 phosphorylated sites (positive examples) and 28 sequences with 1026 non-phosphorylated serines and threonines (negative examples). Although the set of positive examples contains entries from various taxonomic groups, it is dominated by mammalian species. About one half of the 239 sites originate from <it>H. sapiens </it>(120 sites). Together with the other mammalian entries (93 sites), they make up 89% of the positive set. Of the remaining sites, 21 originate from other metazoan species, and only 5 are from yeast and Viridiplantae.</p>
            <p>It should be noted that, in contrast to several other posttranslational modifications, phosphorylation frequently occurs at multiple sites of the same substrate protein, and this is reflected in the learning set. Of the 143 sequences in the set of positive examples, more than one third (55) have more than one verified phosphorylated serine or threonine residue. As a consequence, almost two thirds of the phosphorylated sites in the learning set (151 of 239) originate from proteins with multiple modifications. The corresponding distribution of the number of phosphorylation sites per protein (shown in Table <tblr tid="T5">5</tblr>) appears to decay roughly exponentially.</p>
            <tbl id="T5">
               <title>
                  <p>Table 5</p>
               </title>
               <caption>
                  <p>Distribution of the number of phosphorylated sites per sequence in the learning set.</p>
               </caption>
               <tblbdy cols="7">
                  <r>
                     <c ca="center">
                        <p>Sites per sequence</p>
                     </c>
                     <c ca="center">
                        <p>N<sub>sequences</sub></p>
                     </c>
                     <c ca="center">
                        <p>% of sequences</p>
                     </c>
                     <c ca="center">
                        <p>% of sequences (cum.)</p>
                     </c>
                     <c ca="center">
                        <p>N<sub>sites</sub></p>
                     </c>
                     <c ca="center">
                        <p>% of sites</p>
                     </c>
                     <c ca="center">
                        <p>% of sites (cum.)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>88</p>
                     </c>
                     <c ca="center">
                        <p>61.5</p>
                     </c>
                     <c ca="center">
                        <p>61.5</p>
                     </c>
                     <c ca="center">
                        <p>88</p>
                     </c>
                     <c ca="center">
                        <p>36.8</p>
                     </c>
                     <c ca="center">
                        <p>36.8</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>30</p>
                     </c>
                     <c ca="center">
                        <p>21.0</p>
                     </c>
                     <c ca="center">
                        <p>82.5</p>
                     </c>
                     <c ca="center">
                        <p>60</p>
                     </c>
                     <c ca="center">
                        <p>25.1</p>
                     </c>
                     <c ca="center">
                        <p>61.9</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>10.5</p>
                     </c>
                     <c ca="center">
                        <p>93.0</p>
                     </c>
                     <c ca="center">
                        <p>45</p>
                     </c>
                     <c ca="center">
                        <p>18.8</p>
                     </c>
                     <c ca="center">
                        <p>80.7</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>4.9</p>
                     </c>
                     <c ca="center">
                        <p>97.9</p>
                     </c>
                     <c ca="center">
                        <p>28</p>
                     </c>
                     <c ca="center">
                        <p>11.7</p>
                     </c>
                     <c ca="center">
                        <p>92.4</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.7</p>
                     </c>
                     <c ca="center">
                        <p>98.6</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>2.1</p>
                     </c>
                     <c ca="center">
                        <p>94.5</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.7</p>
                     </c>
                     <c ca="center">
                        <p>99.3</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>2.5</p>
                     </c>
                     <c ca="center">
                        <p>97.0</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.7</p>
                     </c>
                     <c ca="center">
                        <p>100.0</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>3.0</p>
                     </c>
                     <c ca="center">
                        <p>100.0</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="7">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Total</p>
                     </c>
                     <c ca="center">
                        <p>143</p>
                     </c>
                     <c ca="center">
                        <p>100.0</p>
                     </c>
                     <c ca="center">
                        <p>100.0</p>
                     </c>
                     <c ca="center">
                        <p>239</p>
                     </c>
                     <c ca="center">
                        <p>100.0</p>
                     </c>
                     <c ca="center">
                        <p>100.0</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>Positive examples in the dataset contain up to seven sites per sequence. Although almost two thirds (88) of all entries (143) contain only one phosphorylated serine or threonine, these account for just slightly more than one third of the total number of included sites (239). It should be noted that, for many proteins, additional phosphorylation sites may have remained undetected so far.</p>
               </tblfn>
            </tbl>
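            <p>The totals and fractions quoted for Table 5 follow directly from the per-row sequence counts. The short sketch below recomputes them; the dictionary and function names are chosen here for illustration. Cumulative percentages derived from the exact counts can differ from the tabulated values in the last digit, depending on whether rounding is applied before or after accumulation.</p>
            <p><![CDATA[
```python
# Number of sequences carrying a given number of verified sites (Table 5).
SEQUENCES_BY_SITE_COUNT = {1: 88, 2: 30, 3: 15, 4: 7, 5: 1, 6: 1, 7: 1}


def summarize(counts):
    """Derive totals and the share of multiply phosphorylated proteins."""
    n_sequences = sum(counts.values())
    n_sites = sum(k * n for k, n in counts.items())
    multi_sequences = sum(n for k, n in counts.items() if k > 1)
    multi_sites = sum(k * n for k, n in counts.items() if k > 1)
    return {
        "n_sequences": n_sequences,
        "n_sites": n_sites,
        "multi_sequences": multi_sequences,
        "multi_sites": multi_sites,
        "multi_sequence_fraction": multi_sequences / n_sequences,
        "multi_site_fraction": multi_sites / n_sites,
    }


if __name__ == "__main__":
    for key, value in summarize(SEQUENCES_BY_SITE_COUNT).items():
        print(key, value)
```
]]></p>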
         </sec>
         <sec>
            <st>
               <p>Sequence analysis part 1. Redundancy removal</p>
            </st>
            <p>The phosphorylated serines/threonines of a learning set together with their flanking sequences are represented in a gapless multiple alignment of n<sub>pos </sub>= 81 positions. The n<sub>seq </sub>= 239 phosphorylated sites occupy a single column that acts as reference location with an assigned position number of 0. Sequence positions further N-terminally have negative values, positions further C-terminally have positive values.</p>
            <p>To remove redundancy from over-represented sequence sets in the learning alignment <abbrgrp><abbr bid="B73">73</abbr></abbrgrp>, we used a technique similar to the "sum of mismatches" method of Vingron and Argos <abbrgrp><abbr bid="B74">74</abbr><abbr bid="B75">75</abbr></abbrgrp>. The central consideration is that the higher the similarity of a sequence to all remaining sequences in the alignment, the lower its weight <b><it>w </it></b>should be. Here, the number of identical residues between two sequences <b><it>k </it></b>and <b><it>i </it></b>is chosen as the similarity measure. For each sequence <b><it>k</it></b>, the weight <b><it>w</it></b><sub><b><it>k </it></b></sub>is calculated using Kronecker's delta according to equation 1, where <it>a</it>(<it>k</it>, <it>l</it>) denotes the residue of sequence <it>k </it>at alignment position <it>l</it>. The value <it>&#947; </it>is obtained from the normalization to &#8721;<b><it>w<sub>k </sub></it></b>= <b><it>n<sub>seq</sub></it></b>. We use a modified version because the original Vingron and Argos method assigns disproportionately high weights to sequences with many non-amino acid positions, such as those of sites close to the N- or C-terminus, where part of the window lies outside the sequence.</p>
            <p>
               <m:math name="1745-6150-2-1-i2" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:msub>
                           <m:mi>w</m:mi>
                           <m:mi>k</m:mi>
                        </m:msub>
                        <m:mo>=</m:mo>
                        <m:mi>&#947;</m:mi>
                        <m:mfrac>
                           <m:mrow>
                              <m:msub>
                                 <m:mi>n</m:mi>
                                 <m:mrow>
                                    <m:mi>p</m:mi>
                                    <m:mi>o</m:mi>
                                    <m:mi>s</m:mi>
                                 </m:mrow>
                              </m:msub>
                           </m:mrow>
                           <m:mrow>
                              <m:mstyle displaystyle="true">
                                 <m:munderover>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mrow>
                                       <m:mi>l</m:mi>
                                       <m:mo>=</m:mo>
                                       <m:mn>1</m:mn>
                                    </m:mrow>
                                    <m:mrow>
                                       <m:msub>
                                          <m:mi>n</m:mi>
                                          <m:mrow>
                                             <m:mi>p</m:mi>
                                             <m:mi>o</m:mi>
                                             <m:mi>s</m:mi>
                                          </m:mrow>
                                       </m:msub>
                                    </m:mrow>
                                 </m:munderover>
                                 <m:mrow>
                                    <m:mstyle displaystyle="true">
                                       <m:munderover>
                                          <m:mo>&#8721;</m:mo>
                                          <m:mrow>
                                             <m:mi>i</m:mi>
                                             <m:mo>=</m:mo>
                                             <m:mn>1</m:mn>
                                          </m:mrow>
                                          <m:mrow>
                                             <m:msub>
                                                <m:mi>n</m:mi>
                                                <m:mrow>
                                                   <m:mi>s</m:mi>
                                                   <m:mi>e</m:mi>
                                                   <m:mi>q</m:mi>
                                                </m:mrow>
                                             </m:msub>
                                          </m:mrow>
                                       </m:munderover>
                                       <m:mrow>
                                          <m:mi>&#948;</m:mi>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:mi>a</m:mi>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mo>,</m:mo>
                                          <m:mi>l</m:mi>
                                          <m:mo stretchy="false">)</m:mo>
                                          <m:mo>,</m:mo>
                                          <m:mi>a</m:mi>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:mi>i</m:mi>
                                          <m:mo>,</m:mo>
                                          <m:mi>l</m:mi>
                                          <m:mo stretchy="false">)</m:mo>
                                          <m:mo stretchy="false">)</m:mo>
                                       </m:mrow>
                                    </m:mstyle>
                                 </m:mrow>
                              </m:mstyle>
                           </m:mrow>
                        </m:mfrac>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacqWF3bWDdaWgaaWcbaGae83AaSgabeaakiabg2da9GGaciab+n7aNnaalaaabaGae8NBa42aaSbaaSqaaiab=bhaWjab=9gaVjab=nhaZbqabaaakeaadaaeWbqaamaaqahabaGae4hTdqMaeiikaGIae8xyaeMaeiikaGIae83AaSMaeiilaWIae8hBaWMaeiykaKIaeiilaWIae8xyaeMaeiikaGIae8xAaKMaeiilaWIae8hBaWMaeiykaKIaeiykaKcaleaacqWFPbqAcqGH9aqpcqaIXaqmaeaacqWFUbGBdaWgaaadbaGae83CamNae8xzauMae8xCaehabeaaa0GaeyyeIuoaaSqaaiab=XgaSjabg2da9iabigdaXaqaaiab=5gaUnaaBaaameaacqWFWbaCcqWFVbWBcqWFZbWCaeqaaaqdcqGHris5aaaakiaaxMaacaWLjaWaaeWaaeaacqaIXaqmaiaawIcacaGLPaaaaaa@63F3@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
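            <p>Equation 1 can be evaluated efficiently by counting, for each alignment column, how often every symbol occurs. The sketch below is an illustration under stated assumptions rather than the pkaPS implementation: the function name <it>sequence_weights</it> is hypothetical, and the non-occupied symbol "-" is treated like any other character, so that it matches itself. How the original program handles non-occupied positions is not specified in the text.</p>
            <p><![CDATA[
```python
from collections import Counter


def sequence_weights(alignment):
    """Weights w_k = gamma * n_pos / sum_l sum_i delta(a(k,l), a(i,l))."""
    n_seq = len(alignment)
    if n_seq == 0:
        raise ValueError("alignment is empty")
    n_pos = len(alignment[0])
    if n_pos == 0 or any(len(row) != n_pos for row in alignment):
        raise ValueError("alignment rows must share one non-zero length")

    # column_counts[l][s] = number of sequences i with a(i, l) == s
    column_counts = [Counter(row[l] for row in alignment) for l in range(n_pos)]

    raw = []
    for row in alignment:
        # Double sum of Kronecker deltas; includes the self-match i == k,
        # so the denominator is never smaller than n_pos.
        identities = sum(column_counts[l][row[l]] for l in range(n_pos))
        raw.append(n_pos / identities)

    gamma = n_seq / sum(raw)  # normalization so that the weights sum to n_seq
    return [gamma * value for value in raw]


if __name__ == "__main__":
    toy_alignment = ["AAAA", "AAAA", "CCCC"]  # made-up gapless alignment
    print(sequence_weights(toy_alignment))
```
]]></p>
            <p>In the toy alignment, the two identical sequences share the weight that a single copy would receive, while the unrelated third sequence is weighted twice as strongly as each of them.</p>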
         </sec>
         <sec>
            <st>
               <p>Sequence analysis part 2. Derivation of physical property characteristics</p>
            </st>
            <p>To assess physical and chemical requirements at specific motif positions, we make use of 20-dimensional property vectors <b><it>v </it></b>which assign a characteristic value <it>v</it><sub><it>a </it></sub>to each amino acid <it>a</it>. These values have been measured in various experimental setups and quantify amino acid properties such as hydrophobicity, volume or charge. We used a pre-compiled property database <abbrgrp><abbr bid="B23">23</abbr><abbr bid="B28">28</abbr></abbrgrp> for the motif analysis. Here, single property vectors are typically specified by short identifiers such as EISD840101. We use only properties which have defined values for all 20 amino acids.</p>
            <p>One means of detecting amino acid requirements for a sequence position <b><it>l </it></b>is to compare property mean values <m:math name="1745-6150-2-1-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mover accent="true"><m:mi>v</m:mi><m:mo>&#175;</m:mo></m:mover><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacuWF2bGDgaqeaaaa@2E41@</m:annotation></m:semantics></m:math>(<it>l</it>) with expected mean values <m:math name="1745-6150-2-1-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mover accent="true"><m:mi>v</m:mi><m:mo>&#175;</m:mo></m:mover><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacuWF2bGDgaqeaaaa@2E41@</m:annotation></m:semantics></m:math><sub><it>DB </it></sub>from biological databases. Significances can be assessed using Student's <it>t</it>-distribution with the help of the property value dispersion &#963;(<it>l</it>). The values <b><it>p</it></b><sub><b><it>a </it></b></sub>represent the database occurrences of amino acid type <b><it>a </it></b>calculated on the basis of UNIREF.</p>
            <p>
               <m:math name="1745-6150-2-1-i4" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:mover accent="true">
                           <m:mi>v</m:mi>
                           <m:mo>&#175;</m:mo>
                        </m:mover>
                        <m:mo stretchy="false">(</m:mo>
                        <m:mi>l</m:mi>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>=</m:mo>
                        <m:mfrac>
                           <m:mrow>
                              <m:mstyle displaystyle="true">
                                 <m:munder>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mi>k</m:mi>
                                 </m:munder>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>w</m:mi>
                                       <m:mi>k</m:mi>
                                    </m:msub>
                                    <m:msub>
                                       <m:mi>v</m:mi>
                                       <m:mrow>
                                          <m:mi>a</m:mi>
                                          <m:mo stretchy="false">(</m:mo>
                                          <m:mi>k</m:mi>
                                          <m:mo>,</m:mo>
                                          <m:mi>l</m:mi>
                                          <m:mo stretchy="false">)</m:mo>
                                       </m:mrow>
                                    </m:msub>
                                 </m:mrow>
                              </m:mstyle>
                           </m:mrow>
                           <m:mrow>
                              <m:mstyle displaystyle="true">
                                 <m:munder>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mi>k</m:mi>
                                 </m:munder>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>w</m:mi>
                                       <m:mi>k</m:mi>
                                    </m:msub>
                                 </m:mrow>
                              </m:mstyle>
                           </m:mrow>
                        </m:mfrac>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>2</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacuWF2bGDgaqeaiabcIcaOiab=XgaSjabcMcaPiabg2da9maalaaabaWaaabuaeaacqWF3bWDdaWgaaWcbaGae83AaSgabeaakiab=zha2naaBaaaleaacqWFHbqycqGGOaakcqWFRbWAcqGGSaalcqWFSbaBcqGGPaqkaeqaaaqaaiab=TgaRbqab0GaeyyeIuoaaOqaamaaqafabaGae83DaC3aaSbaaSqaaiab=TgaRbqabaaabaGae83AaSgabeqdcqGHris5aaaakiaaxMaacaWLjaWaaeWaaeaacqaIYaGmaiaawIcacaGLPaaaaaa@4B4B@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
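            <p>The weighted column mean of equation 2 can be sketched as follows. The function name <it>weighted_property_mean</it> and the three-letter property scale are hypothetical placeholders, not values from the property database. Skipping non-occupied positions and leaving their weights out of the denominator is an assumption, since equation 2 as written sums over all sequences.</p>
            <p><![CDATA[
```python
def weighted_property_mean(column, weights, scale, gap="-"):
    """Weighted mean of a property over one alignment column (equation 2)."""
    if len(column) != len(weights):
        raise ValueError("one weight per sequence is required")
    numerator = 0.0
    denominator = 0.0
    for symbol, weight in zip(column, weights):
        if symbol == gap:
            continue  # assumption: non-occupied positions do not contribute
        numerator += weight * scale[symbol]
        denominator += weight
    if denominator == 0.0:
        raise ValueError("column has no occupied position with non-zero weight")
    return numerator / denominator


if __name__ == "__main__":
    # Toy property values for illustration only; a real scale covers 20 residues.
    toy_scale = {"R": 1.0, "A": 0.0, "L": -1.0}
    column = "RRAL-"                  # residues a(k, l) of five sequences at one l
    weights = [1.0, 1.0, 2.0, 1.0, 3.0]
    print(weighted_property_mean(column, weights, toy_scale))
```
]]></p>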
            <p>
               <m:math name="1745-6150-2-1-i5" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:msub>
                           <m:mover accent="true">
                              <m:mi>v</m:mi>
                              <m:mo>&#175;</m:mo>
                           </m:mover>
                           <m:mrow>
                              <m:mi>D</m:mi>
                              <m:mi>B</m:mi>
                           </m:mrow>
                        </m:msub>
                        <m:mo>=</m:mo>
                        <m:mfrac>
                           <m:mrow>
                              <m:mstyle displaystyle="true">
                                 <m:munder>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mi>a</m:mi>
                                 </m:munder>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>v</m:mi>
                                       <m:mi>a</m:mi>
                                    </m:msub>
                                    <m:msub>
                                       <m:mi>p</m:mi>
                                       <m:mi>a</m:mi>
                                    </m:msub>
                                 </m:mrow>
                              </m:mstyle>
                           </m:mrow>
                           <m:mrow>
                              <m:mstyle displaystyle="true">
                                 <m:munder>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mi>a</m:mi>
                                 </m:munder>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>p</m:mi>
                                       <m:mi>a</m:mi>
                                    </m:msub>
                                 </m:mrow>
                              </m:mstyle>
                           </m:mrow>
                        </m:mfrac>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>3</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacuWF2bGDgaqeamaaBaaaleaacqWFebarcqWFcbGqaeqaaOGaeyypa0ZaaSaaaeaadaaeqbqaaiab=zha2naaBaaaleaacqWFHbqyaeqaaOGae8hCaa3aaSbaaSqaaiab=fgaHbqabaaabaGae8xyaegabeqdcqGHris5aaGcbaWaaabuaeaacqWFWbaCdaWgaaWcbaGae8xyaegabeaaaeaacqWFHbqyaeqaniabggHiLdaaaOGaaCzcaiaaxMaadaqadaqaaiabiodaZaGaayjkaiaawMcaaaaa@44D4@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>
               <m:math name="1745-6150-2-1-i6" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>&#963;</m:mi>
                        <m:mo stretchy="false">(</m:mo>
                        <m:mi>l</m:mi>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>=</m:mo>
                        <m:msqrt>
                           <m:mrow>
                              <m:mfrac>
                                 <m:mrow>
                                    <m:mstyle displaystyle="true">
                                       <m:munder>
                                          <m:mo>&#8721;</m:mo>
                                          <m:mi>k</m:mi>
                                       </m:munder>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mi>w</m:mi>
                                             <m:mi>k</m:mi>
                                          </m:msub>
                                          <m:msup>
                                             <m:mrow>
                                                <m:mrow>
                                                   <m:mo>(</m:mo>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>v</m:mi>
                                                         <m:mrow>
                                                            <m:mi>a</m:mi>
                                                            <m:mo stretchy="false">(</m:mo>
                                                            <m:mi>k</m:mi>
                                                            <m:mo>,</m:mo>
                                                            <m:mi>l</m:mi>
                                                            <m:mo stretchy="false">)</m:mo>
                                                         </m:mrow>
                                                      </m:msub>
                                                      <m:mo>&#8722;</m:mo>
                                                      <m:mover accent="true">
                                                         <m:mi>v</m:mi>
                                                         <m:mo>&#175;</m:mo>
                                                      </m:mover>
                                                      <m:mo stretchy="false">(</m:mo>
                                                      <m:mi>l</m:mi>
                                                      <m:mo stretchy="false">)</m:mo>
                                                   </m:mrow>
                                                   <m:mo>)</m:mo>
                                                </m:mrow>
                                             </m:mrow>
                                             <m:mn>2</m:mn>
                                          </m:msup>
                                       </m:mrow>
                                    </m:mstyle>
                                 </m:mrow>
                                 <m:mrow>
                                    <m:mstyle displaystyle="true">
                                       <m:munder>
                                          <m:mo>&#8721;</m:mo>
                                          <m:mi>k</m:mi>
                                       </m:munder>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mi>w</m:mi>
                                             <m:mi>k</m:mi>
                                          </m:msub>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mn>1</m:mn>
                                       </m:mrow>
                                    </m:mstyle>
                                 </m:mrow>
                              </m:mfrac>
                           </m:mrow>
                        </m:msqrt>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>4</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaiiGacqWFdpWCcqGGOaakieWacqGFSbaBcqGGPaqkcqGH9aqpdaGcaaqaamaalaaabaWaaabuaeaacqGF3bWDdaWgaaWcbaGae43AaSgabeaakmaabmaabaGae4NDay3aaSbaaSqaaiab+fgaHjabcIcaOiab+TgaRjabcYcaSiab+XgaSjabcMcaPaqabaGccqGHsislcuGF2bGDgaqeaiabcIcaOiab+XgaSjabcMcaPaGaayjkaiaawMcaamaaCaaaleqabaGaeGOmaidaaaqaaiab+TgaRbqab0GaeyyeIuoaaOqaamaaqafabaGae43DaC3aaSbaaSqaaiab+TgaRbqabaGccqGHsislcqaIXaqmaSqaaiab+TgaRbqab0GaeyyeIuoaaaaaleqaaOGaaCzcaiaaxMaadaqadaqaaiabisda0aGaayjkaiaawMcaaaaa@55C7@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
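            <p>As an illustration only, the weighted mean of equation 2 and the weighted standard deviation of equation 4 could be computed as in the following Python sketch. It assumes that <it>w</it><sub><it>k</it></sub> is the weight of sequence <it>k</it>, that <it>v</it><sub><it>a(k,l)</it></sub> is the property value of the residue found in sequence <it>k</it> at position <it>l</it>, and that the denominator of equation 4 is read as the sum of all weights minus one. The function names and the numbers are hypothetical and are not taken from the pkaPS implementation.</p>
            <p>
```python
import math


def weighted_mean(values, weights):
    """Equation 2: sum_k w_k * v_k divided by sum_k w_k."""
    total_weight = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total_weight


def weighted_std(values, weights):
    """Equation 4, assuming the denominator is (sum_k w_k) - 1."""
    mean = weighted_mean(values, weights)
    squared_dev = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    return math.sqrt(squared_dev / (sum(weights) - 1.0))


# Hypothetical property values at one alignment position for four sequences.
values = [1.0, 2.0, 3.0, 4.0]
# Unit weights reduce both formulas to the ordinary sample mean and deviation.
weights = [1.0, 1.0, 1.0, 1.0]

mean = weighted_mean(values, weights)
std = weighted_std(values, weights)
print(mean, std)
```
            </p>
            <p>With unit weights, the sketch reduces to the ordinary sample mean and sample standard deviation, which provides a simple consistency check.</p>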
            <p>A more sensitive method involves the calculation of the correlation coefficient <it>R(l) </it>between the property values <it>v</it><sub><it>a </it></sub>and the observed amino acid counts <it>c</it>(<it>a, l</it>) at an alignment position <it>l</it>. The underlying consideration is that amino acids with high values <it>v</it><sub><it>a </it></sub>of a required property should occur more frequently than residues with hindering, low property values (or vice versa).</p>
            <p>
               <m:math name="1745-6150-2-1-i7" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>R</m:mi>
                        <m:mo stretchy="false">(</m:mo>
                        <m:mi>l</m:mi>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>=</m:mo>
                        <m:mfrac>
                           <m:mrow>
                              <m:mn>20</m:mn>
                              <m:mstyle displaystyle="true">
                                 <m:munder>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mi>a</m:mi>
                                 </m:munder>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>v</m:mi>
                                       <m:mi>a</m:mi>
                                    </m:msub>
                                    <m:mi>c</m:mi>
                                    <m:mo stretchy="false">(</m:mo>
                                    <m:mi>a</m:mi>
                                    <m:mo>,</m:mo>
                                    <m:mi>l</m:mi>
                                    <m:mo stretchy="false">)</m:mo>
                                 </m:mrow>
                              </m:mstyle>
                              <m:mo>&#8722;</m:mo>
                              <m:mstyle displaystyle="true">
                                 <m:munder>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mi>a</m:mi>
                                 </m:munder>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>v</m:mi>
                                       <m:mi>a</m:mi>
                                    </m:msub>
                                 </m:mrow>
                              </m:mstyle>
                              <m:mstyle displaystyle="true">
                                 <m:munder>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mi>a</m:mi>
                                 </m:munder>
                                 <m:mrow>
                                    <m:mi>c</m:mi>
                                    <m:mo stretchy="false">(</m:mo>
                                    <m:mi>a</m:mi>
                                    <m:mo>,</m:mo>
                                    <m:mi>l</m:mi>
                                    <m:mo stretchy="false">)</m:mo>
                                 </m:mrow>
                              </m:mstyle>
                           </m:mrow>
                           <m:mrow>
                              <m:msqrt>
                                 <m:mrow>
                                    <m:mrow>
                                       <m:mo>(</m:mo>
                                       <m:mrow>
                                          <m:mn>20</m:mn>
                                          <m:mstyle displaystyle="true">
                                             <m:munder>
                                                <m:mo>&#8721;</m:mo>
                                                <m:mi>a</m:mi>
                                             </m:munder>
                                             <m:mrow>
                                                <m:msubsup>
                                                   <m:mi>v</m:mi>
                                                   <m:mi>a</m:mi>
                                                   <m:mn>2</m:mn>
                                                </m:msubsup>
                                             </m:mrow>
                                          </m:mstyle>
                                          <m:mo>&#8722;</m:mo>
                                          <m:msup>
                                             <m:mrow>
                                                <m:mrow>
                                                   <m:mo>(</m:mo>
                                                   <m:mrow>
                                                      <m:mstyle displaystyle="true">
                                                         <m:munder>
                                                            <m:mo>&#8721;</m:mo>
                                                            <m:mi>a</m:mi>
                                                         </m:munder>
                                                         <m:mrow>
                                                            <m:msub>
                                                               <m:mi>v</m:mi>
                                                               <m:mi>a</m:mi>
                                                            </m:msub>
                                                         </m:mrow>
                                                      </m:mstyle>
                                                   </m:mrow>
                                                   <m:mo>)</m:mo>
                                                </m:mrow>
                                             </m:mrow>
                                             <m:mn>2</m:mn>
                                          </m:msup>
                                       </m:mrow>
                                       <m:mo>)</m:mo>
                                    </m:mrow>
                                    <m:mrow>
                                       <m:mo>(</m:mo>
                                       <m:mrow>
                                          <m:mn>20</m:mn>
                                          <m:mstyle displaystyle="true">
                                             <m:munder>
                                                <m:mo>&#8721;</m:mo>
                                                <m:mi>a</m:mi>
                                             </m:munder>
                                             <m:mrow>
                                                <m:mi>c</m:mi>
                                                <m:msup>
                                                   <m:mrow>
                                                      <m:mo stretchy="false">(</m:mo>
                                                      <m:mi>a</m:mi>
                                                      <m:mo>,</m:mo>
                                                      <m:mi>l</m:mi>
                                                      <m:mo stretchy="false">)</m:mo>
                                                   </m:mrow>
                                                   <m:mn>2</m:mn>
                                                </m:msup>
                                             </m:mrow>
                                          </m:mstyle>
                                          <m:mo>&#8722;</m:mo>
                                          <m:msup>
                                             <m:mrow>
                                                <m:mrow>
                                                   <m:mo>(</m:mo>
                                                   <m:mrow>
                                                      <m:mstyle displaystyle="true">
                                                         <m:munder>
                                                            <m:mo>&#8721;</m:mo>
                                                            <m:mi>a</m:mi>
                                                         </m:munder>
                                                         <m:mrow>
                                                            <m:mi>c</m:mi>
                                                            <m:mo stretchy="false">(</m:mo>
                                                            <m:mi>a</m:mi>
                                                            <m:mo>,</m:mo>
                                                            <m:mi>l</m:mi>
                                                            <m:mo stretchy="false">)</m:mo>
                                                         </m:mrow>
                                                      </m:mstyle>
                                                   </m:mrow>
                                                   <m:mo>)</m:mo>
                                                </m:mrow>
                                             </m:mrow>
                                             <m:mn>2</m:mn>
                                          </m:msup>
                                       </m:mrow>
                                       <m:mo>)</m:mo>
                                    </m:mrow>
                                 </m:mrow>
                              </m:msqrt>
                           </m:mrow>
                        </m:mfrac>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>5</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacqWFsbGucqGGOaakcqWFSbaBcqGGPaqkcqGH9aqpdaWcaaqaaiabikdaYiabicdaWmaaqafabaGae8NDay3aaSbaaSqaaiab=fgaHbqabaGccqWFJbWycqGGOaakcqWFHbqycqGGSaalcqWFSbaBcqGGPaqkaSqaaiab=fgaHbqab0GaeyyeIuoakiabgkHiTmaaqafabaGae8NDay3aaSbaaSqaaiab=fgaHbqabaaabaGae8xyaegabeqdcqGHris5aOWaaabuaeaacqWFJbWycqGGOaakcqWFHbqycqGGSaalcqWFSbaBcqGGPaqkaSqaaiab=fgaHbqab0GaeyyeIuoaaOqaamaakaaabaWaaeWaaeaacqaIYaGmcqaIWaamdaaeqbqaaiab=zha2naaDaaaleaacqWFHbqyaeaacqaIYaGmaaaabaGae8xyaegabeqdcqGHris5aOGaeyOeI0YaaeWaaeaadaaeqbqaaiab=zha2naaBaaaleaacqWFHbqyaeqaaaqaaiab=fgaHbqab0GaeyyeIuoaaOGaayjkaiaawMcaamaaCaaaleqabaGaeGOmaidaaaGccaGLOaGaayzkaaWaaeWaaeaacqaIYaGmcqaIWaamdaaeqbqaaiab=ngaJjabcIcaOiab=fgaHjabcYcaSiab=XgaSjabcMcaPmaaCaaaleqabaGaeGOmaidaaaqaaiab=fgaHbqab0GaeyyeIuoakiabgkHiTmaabmaabaWaaabuaeaacqWFJbWycqGGOaakcqWFHbqycqGGSaalcqWFSbaBcqGGPaqkaSqaaiab=fgaHbqab0GaeyyeIuoaaOGaayjkaiaawMcaamaaCaaaleqabaGaeGOmaidaaaGccaGLOaGaayzkaaaaleqaaaaakiaaxMaacaWLjaWaaeWaaeaacqaI1aqnaiaawIcacaGLPaaaaaa@867D@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
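            <p>Equation 5 is the standard product-moment correlation taken over the 20 amino acid types. A minimal Python sketch, assuming that the property values <it>v</it><sub><it>a</it></sub> and the counts <it>c</it>(<it>a, l</it>) are supplied as two lists in the same amino acid order, might look as follows. The function name and the test data are hypothetical.</p>
            <p>
```python
import math


def property_count_correlation(v, c):
    """Equation 5: correlation between property values v_a and counts c(a, l)."""
    n = len(v)  # 20 when all standard amino acid types are used
    sum_v = sum(v)
    sum_c = sum(c)
    sum_vc = sum(x * y for x, y in zip(v, c))
    sum_vv = sum(x * x for x in v)
    sum_cc = sum(y * y for y in c)
    numerator = n * sum_vc - sum_v * sum_c
    denominator = math.sqrt((n * sum_vv - sum_v ** 2) * (n * sum_cc - sum_c ** 2))
    return numerator / denominator


# Made-up data: counts that rise linearly with the property value.
v = [float(i) for i in range(20)]
c = [2.0 * x + 1.0 for x in v]
r = property_count_correlation(v, c)
print(r)
```
            </p>
            <p>In this artificial case the counts are an exact linear function of the property values, so the coefficient reaches its maximum of 1. Note that the sketch does not guard against a zero denominator, which occurs when all counts (or all property values) are identical.</p>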
            <p>The statistical significance of <it>R </it>(<it>l</it>) can be calculated using the decision criterion <abbrgrp><abbr bid="B76">76</abbr></abbrgrp>:</p>
            <p>
               <m:math name="1745-6150-2-1-i8" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:msub>
                           <m:mi>t</m:mi>
                           <m:mi>&#945;</m:mi>
                        </m:msub>
                        <m:mo>=</m:mo>
                        <m:mfrac>
                           <m:mrow>
                              <m:mi>R</m:mi>
                              <m:msqrt>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>n</m:mi>
                                       <m:mi>v</m:mi>
                                    </m:msub>
                                    <m:mo>&#8722;</m:mo>
                                    <m:mn>3</m:mn>
                                 </m:mrow>
                              </m:msqrt>
                           </m:mrow>
                           <m:mrow>
                              <m:msqrt>
                                 <m:mrow>
                                    <m:mn>1</m:mn>
                                    <m:mo>&#8722;</m:mo>
                                    <m:msup>
                                       <m:mi>R</m:mi>
                                       <m:mn>2</m:mn>
                                    </m:msup>
                                 </m:mrow>
                              </m:msqrt>
                           </m:mrow>
                        </m:mfrac>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>6</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacqWF0baDdaWgaaWcbaacciGae4xSdegabeaakiabg2da9maalaaabaGae8Nuai1aaOaaaeaacqWFUbGBdaWgaaWcbaGae8NDayhabeaakiabgkHiTiabiodaZaWcbeaaaOqaamaakaaabaGaeGymaeJaeyOeI0Iae8Nuai1aaWbaaSqabeaacqaIYaGmaaaabeaaaaGccaWLjaGaaCzcamaabmaabaGaeGOnaydacaGLOaGaayzkaaaaaa@3F53@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p><it>t</it><sub><it>&#945; </it></sub>is the argument of Student's <it>t</it>-distribution function for a one-sided criterion with the confidence level &#945;, and the 3 subtracted from <it>n</it><sub><it>v</it></sub> stands for the number of conditions (two for the linear regression and one for the sum of all residue type frequencies being unity).</p>
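            <p>The test statistic of equation 6 can be evaluated directly, as in the short Python sketch below. Here <it>n</it><sub><it>v</it></sub> is assumed to be the number of data points entering the correlation (20 if all amino acid types are used); this reading, the function name and the example value of <it>R</it> are assumptions made for illustration.</p>
            <p>
```python
import math


def t_statistic(r, n_v):
    """Equation 6: t = R * sqrt(n_v - 3) / sqrt(1 - R^2)."""
    return r * math.sqrt(n_v - 3) / math.sqrt(1.0 - r * r)


# Hypothetical correlation coefficient over 20 amino acid types.
t_value = t_statistic(0.6, 20)
print(t_value)
```
            </p>
            <p>The resulting value would then be compared with a tabulated one-sided critical value of Student's distribution at the chosen confidence level.</p>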
            <p>We employ Fisher's test for the detection of inter-positional correlations. Here, the sum of the squared standard deviations <it>&#963;</it>(<it>l</it><sub><it>i</it></sub>) for all <it>n</it><sub><it>pos </it></sub>isolated positions <it>l</it><sub><it>i </it></sub>is compared to the squared standard deviation <it>&#963;</it>(<it>l</it><sub><it>1</it></sub><it>,l</it><sub><it>2</it></sub><it>,...,l</it><sub><it>npos</it></sub>) of the combined positions <abbrgrp><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr><abbr bid="B24">24</abbr><abbr bid="B26">26</abbr></abbrgrp>:</p>
            <p>
               <m:math name="1745-6150-2-1-i9" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>F</m:mi>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mrow>
                              <m:msub>
                                 <m:mi>l</m:mi>
                                 <m:mn>1</m:mn>
                              </m:msub>
                              <m:mo>,</m:mo>
                              <m:msub>
                                 <m:mi>l</m:mi>
                                 <m:mn>2</m:mn>
                              </m:msub>
                              <m:mo>,</m:mo>
                              <m:mn>...</m:mn>
                              <m:mo>,</m:mo>
                              <m:msub>
                                 <m:mi>l</m:mi>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>n</m:mi>
                                       <m:mrow>
                                          <m:mi>p</m:mi>
                                          <m:mi>o</m:mi>
                                          <m:mi>s</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                 </m:mrow>
                              </m:msub>
                           </m:mrow>
                           <m:mo>)</m:mo>
                        </m:mrow>
                        <m:mo>=</m:mo>
                        <m:mfrac>
                           <m:mrow>
                              <m:mstyle displaystyle="true">
                                 <m:munderover>
                                    <m:mo>&#8721;</m:mo>
                                    <m:mrow>
                                       <m:mi>i</m:mi>
                                       <m:mo>=</m:mo>
                                       <m:mn>1</m:mn>
                                    </m:mrow>
                                    <m:mrow>
                                       <m:msub>
                                          <m:mi>n</m:mi>
                                          <m:mrow>
                                             <m:mi>p</m:mi>
                                             <m:mi>o</m:mi>
                                             <m:mi>s</m:mi>
                                          </m:mrow>
                                       </m:msub>
                                    </m:mrow>
                                 </m:munderover>
                                 <m:mrow>
                                    <m:mi>&#963;</m:mi>
                                    <m:msup>
                                       <m:mrow>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mi>i</m:mi>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                       </m:mrow>
                                       <m:mn>2</m:mn>
                                    </m:msup>
                                 </m:mrow>
                              </m:mstyle>
                           </m:mrow>
                           <m:mrow>
                              <m:mi>&#963;</m:mi>
                              <m:msup>
                                 <m:mrow>
                                    <m:mrow>
                                       <m:mo>(</m:mo>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mi>l</m:mi>
                                             <m:mn>1</m:mn>
                                          </m:msub>
                                          <m:mo>,</m:mo>
                                          <m:msub>
                                             <m:mi>l</m:mi>
                                             <m:mn>2</m:mn>
                                          </m:msub>
                                          <m:mo>,</m:mo>
                                          <m:mn>...</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:msub>
                                             <m:mi>l</m:mi>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>n</m:mi>
                                                   <m:mrow>
                                                      <m:mi>p</m:mi>
                                                      <m:mi>o</m:mi>
                                                      <m:mi>s</m:mi>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                          </m:msub>
                                       </m:mrow>
                                       <m:mo>)</m:mo>
                                    </m:mrow>
                                 </m:mrow>
                                 <m:mn>2</m:mn>
                              </m:msup>
                           </m:mrow>
                        </m:mfrac>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>7</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacqWFgbGrdaqadaqaaiab=XgaSnaaBaaaleaacqaIXaqmaeqaaOGaeiilaWIae8hBaW2aaSbaaSqaaiabikdaYaqabaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqWFSbaBdaWgaaWcbaGae8NBa42aaSbaaWqaaiab=bhaWjab=9gaVjab=nhaZbqabaaaleqaaaGccaGLOaGaayzkaaGaeyypa0ZaaSaaaeaadaaeWbqaaGGaciab+n8aZnaabmaabaGae8hBaW2aaSbaaSqaaiab=LgaPbqabaaakiaawIcacaGLPaaadaahaaWcbeqaaiabikdaYaaaaeaacqWFPbqAcqGH9aqpcqaIXaqmaeaacqWFUbGBdaWgaaadbaGae83Ba8Mae8hCaaNae83Camhabeaaa0GaeyyeIuoaaOqaaiab+n8aZnaabmaabaGae8hBaW2aaSbaaSqaaiabigdaXaqabaGccqGGSaalcqWFSbaBdaWgaaWcbaGaeGOmaidabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=XgaSnaaBaaaleaacqWFUbGBdaWgaaadbaGae8hCaaNae83Ba8Mae83CamhabeaaaSqabaaakiaawIcacaGLPaaadaahaaWcbeqaaiabikdaYaaaaaGccaWLjaGaaCzcamaabmaabaGaeG4naCdacaGLOaGaayzkaaaaaa@6EB0@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>The obtained <it>F</it>-value follows an <it>F</it>-distribution with <it>n</it><sub><it>seq </it></sub>&#8211; 1 degrees of freedom <abbrgrp><abbr bid="B76">76</abbr></abbrgrp>. For weighted sequences, <it>n</it><sub><it>seq </it></sub>needs to be replaced by the sum of the weights of all sequences that are included in the <it>F</it>-value calculation.</p>
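            <p>The ratio in equation 7 is simple to compute once the standard deviations are available. The Python sketch below assumes that the per-position standard deviations <it>&#963;</it>(<it>l</it><sub><it>i</it></sub>) and the standard deviation of the combined positions have already been obtained with equation 4; the function name and the input values are hypothetical.</p>
            <p>
```python
def f_ratio(position_sigmas, combined_sigma):
    """Equation 7: sum of sigma(l_i)^2 divided by the combined sigma^2."""
    numerator = sum(s * s for s in position_sigmas)
    return numerator / (combined_sigma * combined_sigma)


# Made-up standard deviations for two isolated positions and their combination.
f_value = f_ratio([1.0, 2.0], 1.5)
print(f_value)
```
            </p>
            <p>The sketch only forms the ratio; judging its significance still requires the <it>F</it>-distribution with the degrees of freedom stated above.</p>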
            <p>Mean values and standard deviations (equations 2, 3 and 4) as well as property correlations (equations 5 and 6) and F-tests (equation 7) have been routinely used in the derivation of the physical property pattern surrounding the phosphorylation sites (see first three sections of Results). In the Results, we often write <m:math name="1745-6150-2-1-i3" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mover accent="true"><m:mi>v</m:mi><m:mo>&#175;</m:mo></m:mover><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacuWF2bGDgaqeaaaa@2E41@</m:annotation></m:semantics></m:math>, <b>&#963;</b>, <b><it>R </it></b>and <b><it>F </it></b>without positional arguments when we describe the positions in the text.</p>
         </sec>
         <sec>
            <st>
               <p>Details of the prediction methodology</p>
            </st>
            <p>Each prediction produces a score <it>S</it> that is composed of a profile term <it>S</it><sub><it>profile </it></sub>and a physico-chemical penalty term <it>S</it><sub><it>ppt</it></sub>.</p>
            <p><b><it>S </it></b>= <b><it>S</it></b><sub><b><it>profile </it></b></sub>+ <b><it>S</it></b><sub><b><it>ppt </it></b></sub>&#160;&#160;&#160; (8)</p>
            <p>The query sequence is predicted as positive if the score is &#8805; a predefined threshold <it>b</it>. We chose a threshold of <b><it>b </it></b>= 0 for the prediction of PKA-dependent phosphorylation and <b><it>b </it></b>= -0.5 for the twilight zone (see Results). The profile term is calculated using the PSIC algorithm <abbrgrp><abbr bid="B77">77</abbr></abbrgrp>, a method that provides sequence- and alignment position-specific weights and thereby removes redundancy among homologous sequences from over-represented protein families. Note that the redundancy removal for the physical property calculation was carried out differently, with a modification of the Vingron and Argos procedure <abbrgrp><abbr bid="B74">74</abbr><abbr bid="B75">75</abbr></abbrgrp> (see Sequence analysis part 1 in Methods). As different motif positions generally differ in their importance for substrate binding efficiency, the profile value contributions <it>S</it><sub><it>j</it></sub>(<it>a</it>(<it>l</it><sub><it>j</it></sub>)) of amino acids <it>a </it>at positions <it>l</it><sub><it>j </it></sub>are weighted using factors <it>&#945;</it><sub><b><it>profile, j</it></b></sub>. For a profile that consists of <it>n</it><sub><it>pos </it></sub>positions, the term <it>S</it><sub><it>profile </it></sub>can be expressed as:</p>
            <p>
               <m:math name="1745-6150-2-1-i10" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:msub>
                           <m:mi>S</m:mi>
                           <m:mrow>
                              <m:mi>p</m:mi>
                              <m:mi>r</m:mi>
                              <m:mi>o</m:mi>
                              <m:mi>f</m:mi>
                              <m:mi>i</m:mi>
                              <m:mi>l</m:mi>
                              <m:mi>e</m:mi>
                           </m:mrow>
                        </m:msub>
                        <m:mo>=</m:mo>
                        <m:mstyle displaystyle="true">
                           <m:munderover>
                              <m:mo>&#8721;</m:mo>
                              <m:mrow>
                                 <m:mi>j</m:mi>
                                 <m:mo>=</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                              <m:mrow>
                                 <m:msub>
                                    <m:mi>n</m:mi>
                                    <m:mrow>
                                       <m:mi>p</m:mi>
                                       <m:mi>o</m:mi>
                                       <m:mi>s</m:mi>
                                    </m:mrow>
                                 </m:msub>
                              </m:mrow>
                           </m:munderover>
                           <m:mrow>
                              <m:msub>
                                 <m:mi>&#945;</m:mi>
                                 <m:mrow>
                                    <m:mi>p</m:mi>
                                    <m:mi>r</m:mi>
                                    <m:mi>o</m:mi>
                                    <m:mi>f</m:mi>
                                    <m:mi>i</m:mi>
                                    <m:mi>l</m:mi>
                                    <m:mi>e</m:mi>
                                    <m:mo>,</m:mo>
                                    <m:mi>j</m:mi>
                                 </m:mrow>
                              </m:msub>
                           </m:mrow>
                        </m:mstyle>
                        <m:msub>
                           <m:mi>S</m:mi>
                           <m:mi>j</m:mi>
                        </m:msub>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mrow>
                              <m:mi>a</m:mi>
                              <m:mrow>
                                 <m:mo>(</m:mo>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>l</m:mi>
                                       <m:mi>j</m:mi>
                                    </m:msub>
                                 </m:mrow>
                                 <m:mo>)</m:mo>
                              </m:mrow>
                           </m:mrow>
                           <m:mo>)</m:mo>
                        </m:mrow>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>9</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacqWFtbWudaWgaaWcbaGae8hCaaNae8NCaiNae83Ba8Mae8NzayMae8xAaKMae8hBaWMae8xzaugabeaakiabg2da9maaqahabaacciGae4xSde2aaSbaaSqaaiab=bhaWjab=jhaYjab=9gaVjab=zgaMjab=LgaPjab=XgaSjab=vgaLjabcYcaSiab=PgaQbqabaaabaGae8NAaOMaeyypa0JaeGymaedabaGae8NBa42aaSbaaWqaaiab=bhaWjab=9gaVjab=nhaZbqabaaaniabggHiLdGccqWFtbWudaWgaaWcbaGae8NAaOgabeaakmaabmaabaGae8xyae2aaeWaaeaacqWFSbaBdaWgaaWcbaGae8NAaOgabeaaaOGaayjkaiaawMcaaaGaayjkaiaawMcaaiaaxMaacaWLjaWaaeWaaeaacqaI5aqoaiaawIcacaGLPaaaaaa@5F50@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
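            <p>As a minimal sketch of equations 8 and 9 (not the pkaPS implementation itself), the following Python fragment sums weighted per-position profile contributions and compares the total score with the thresholds <it>b</it> = 0 and <it>b</it> = -0.5 quoted above. The function names, the label strings and the numbers in the example are illustrative assumptions; the PSIC-derived contributions <it>S</it><sub><it>j</it></sub> and the penalty term <it>S</it><sub><it>ppt</it></sub> are taken as given inputs.</p>
            <p><![CDATA[
```python
from typing import Sequence


def profile_score(contributions: Sequence[float], weights: Sequence[float]) -> float:
    """Equation 9: weighted sum of per-position profile contributions S_j(a(l_j))."""
    if len(contributions) != len(weights):
        raise ValueError("need exactly one weight per motif position")
    return sum(w * s for w, s in zip(weights, contributions))


def total_score(s_profile: float, s_ppt: float) -> float:
    """Equation 8: S = S_profile + S_ppt."""
    return s_profile + s_ppt


def classify(score: float, b_predicted: float = 0.0, b_twilight: float = -0.5) -> str:
    """Compare S with the two thresholds; the returned labels are hypothetical."""
    if score >= b_predicted:
        return "predicted"
    if score >= b_twilight:
        return "twilight zone"
    return "not predicted"
```
]]></p>
            <p>For instance, hypothetical contributions (0.4, 0.3) with weights (1.0, 2.0) give a profile term of 1.0; combined with a penalty term of -1.2, the total score is about -0.2, which this sketch places in the twilight zone.</p>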
            <p>The total penalty <it>S</it><sub><it>ppt </it></sub>is the weighted sum of all <it>n</it><sub><it>penalties </it></sub>penalty terms <it>T</it><sub><it>j</it></sub>, where each <it>T</it><sub><it>j </it></sub>reflects a piece of the acquired knowledge about substrate binding requirements in the motif region. The magnitude of each penalty can be adjusted using the corresponding weight factor <it>&#945;</it><sub><b><it>ppt, j</it></b></sub>.</p>
            <p>
               <m:math name="1745-6150-2-1-i11" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:msub>
                           <m:mi>S</m:mi>
                           <m:mrow>
                              <m:mi>p</m:mi>
                              <m:mi>p</m:mi>
                              <m:mi>t</m:mi>
                           </m:mrow>
                        </m:msub>
                        <m:mo>=</m:mo>
                        <m:mstyle displaystyle="true">
                           <m:munderover>
                              <m:mo>&#8721;</m:mo>
                              <m:mrow>
                                 <m:mi>j</m:mi>
                                 <m:mo>=</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                              <m:mrow>
                                 <m:msub>
                                    <m:mi>n</m:mi>
                                    <m:mrow>
                                       <m:mi>p</m:mi>
                                       <m:mi>e</m:mi>
                                       <m:mi>n</m:mi>
                                       <m:mi>a</m:mi>
                                       <m:mi>l</m:mi>
                                       <m:mi>t</m:mi>
                                       <m:mi>i</m:mi>
                                       <m:mi>e</m:mi>
                                       <m:mi>s</m:mi>
                                    </m:mrow>
                                 </m:msub>
                              </m:mrow>
                           </m:munderover>
                           <m:mrow>
                              <m:msub>
                                 <m:mi>&#945;</m:mi>
                                 <m:mrow>
                                    <m:mi>p</m:mi>
                                    <m:mi>p</m:mi>
                                    <m:mi>t</m:mi>
                                    <m:mo>,</m:mo>
                                    <m:mi>j</m:mi>
                                 </m:mrow>
                              </m:msub>
                           </m:mrow>
                        </m:mstyle>
                        <m:msub>
                           <m:mi>T</m:mi>
                           <m:mi>j</m:mi>
                        </m:msub>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mrow>
                              <m:mn>10</m:mn>
                           </m:mrow>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacqWFtbWudaWgaaWcbaGae8hCaaNae8hCaaNae8hDaqhabeaakiabg2da9maaqahabaacciGae4xSde2aaSbaaSqaaiab=bhaWjab=bhaWjab=rha0jabcYcaSiab=PgaQbqabaaabaGae8NAaOMaeyypa0JaeGymaedabaGae8NBa42aaSbaaWqaaiab=bhaWjab=vgaLjab=5gaUjab=fgaHjab=XgaSjab=rha0jab=LgaPjab=vgaLjab=nhaZbqabaaaniabggHiLdGccqWFubavdaWgaaWcbaGae8NAaOgabeaakiaaxMaacaWLjaWaaeWaaeaacqaIXaqmcqaIWaamaiaawIcacaGLPaaaaaa@5653@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
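            <p>Equation 10 can be illustrated with a short Python sketch. It assumes that the individual penalty terms <it>T</it><sub><it>j</it></sub> have already been evaluated; the function name and the example values are hypothetical and only show how a weight factor scales the contribution of a single term.</p>
            <p><![CDATA[
```python
from typing import Sequence


def penalty_score(terms: Sequence[float], weights: Sequence[float]) -> float:
    """Equation 10: S_ppt as the weighted sum of the penalty terms T_j."""
    if len(terms) != len(weights):
        raise ValueError("need exactly one weight per penalty term")
    # Each term T_j is multiplied by its own weight factor alpha_ppt,j.
    return sum(a * t for a, t in zip(weights, terms))
```
]]></p>
            <p>With illustrative terms (-1, 0, -0.5) and weights (0.5, 2.0, 1.0), the first term contributes -0.5, the second nothing and the third -0.5, so the total penalty in this example is -1.0.</p>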
            <p>Each term <it>T</it><sub><it>j </it></sub>has an associated property <b>v</b><sub>j </sub>that represents the type of physico-chemical requirement, e.g. hydrophobicity or size. Its mean value in the query sequence <m:math name="1745-6150-2-1-i12" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mover accent="true"><m:mi>v</m:mi><m:mo>&#175;</m:mo></m:mover><m:mi>j</m:mi></m:msub><m:mo stretchy="false">(</m:mo><m:msub><m:mi>l</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>l</m:mi><m:mrow><m:msub><m:mi>n</m:mi><m:mrow><m:mi>p</m:mi><m:mi>o</m:mi><m:mi>s</m:mi></m:mrow></m:msub></m:mrow></m:msub><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacuWF2bGDgaqeamaaBaaaleaacqWFQbGAaeqaaOGaeiikaGIae8hBaW2aaSbaaSqaaiabigdaXaqabaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqWFSbaBdaWgaaWcbaGae8NBa42aaSbaaWqaaiab=bhaWjab=9gaVjab=nhaZbqabaaaleqaaOGaeiykaKcaaa@3FD0@</m:annotation></m:semantics></m:math> over a set of <it>n</it><sub><it>pos </it></sub>motif positions <it>l</it><sub><it>i</it></sub>, together with the respective mean value over the learning set <m:math name="1745-6150-2-1-i13" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mover accent="true"><m:mi>v</m:mi><m:mo>&#175;</m:mo></m:mover><m:mrow><m:msub><m:mi>j</m:mi><m:mrow><m:mi>L</m:mi><m:mi>S</m:mi></m:mrow></m:msub></m:mrow></m:msub><m:mo stretchy="false">(</m:mo><m:msub><m:mi>l</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>l</m:mi><m:mrow><m:msub><m:mi>n</m:mi><m:mrow><m:mi>p</m:mi><m:mi>o</m:mi><m:mi>s</m:mi></m:mrow></m:msub></m:mrow></m:msub><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacuWF2bGDgaqeamaaBaaaleaacqWFQbGAdaWgaaadbaGae8htaWKae83uamfabeaaaSqabaGccqGGOaakcqWFSbaBdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=XgaSnaaBaaaleaacqWFUbGBdaWgaaadbaGae8hCaaNae83Ba8Mae83CamhabeaaaSqabaGccqGGPaqkaaa@4250@</m:annotation></m:semantics></m:math>, the learning set dispersion <m:math name="1745-6150-2-1-i14" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mi>&#963;</m:mi><m:mrow><m:msub><m:mi>j</m:mi><m:mrow><m:mi>L</m:mi><m:mi>S</m:mi></m:mrow></m:msub></m:mrow></m:msub><m:mo stretchy="false">(</m:mo><m:msub><m:mi>l</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>l</m:mi><m:mrow><m:msub><m:mi>n</m:mi><m:mrow><m:mi>p</m:mi><m:mi>o</m:mi><m:mi>s</m:mi></m:mrow></m:msub></m:mrow></m:msub><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaiiGacqWFdpWCdaWgaaWcbaacbmGae4NAaO2aaSbaaWqaaiab+Xeamjab+nfatbqabaaaleqaaOGaeiikaGIae4hBaW2aaSbaaSqaaiabigdaXaqabaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqGFSbaBdaWgaaWcbaGae4NBa42aaSbaaWqaaiab+bhaWjab+9gaVjab+nhaZbqabaaaleqaaOGaeiykaKcaaa@4288@</m:annotation></m:semantics></m:math> and a freely selectable parameter <it>t</it><sub><it>j </it></sub>are used as a basis for calculation of <it>T</it><sub><it>j</it></sub>.</p>
            <p>We use two types of penalties: (i) fixed penalties and (ii) Gaussian-type penalties. Fixed penalties are applied if the property mean value <m:math name="1745-6150-2-1-i12" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mover accent="true"><m:mi>v</m:mi><m:mo>&#175;</m:mo></m:mover><m:mi>j</m:mi></m:msub><m:mo stretchy="false">(</m:mo><m:msub><m:mi>l</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>l</m:mi><m:mrow><m:msub><m:mi>n</m:mi><m:mrow><m:mi>p</m:mi><m:mi>o</m:mi><m:mi>s</m:mi></m:mrow></m:msub></m:mrow></m:msub><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacuWF2bGDgaqeamaaBaaaleaacqWFQbGAaeqaaOGaeiikaGIae8hBaW2aaSbaaSqaaiabigdaXaqabaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqWFSbaBdaWgaaWcbaGae8NBa42aaSbaaWqaaiab=bhaWjab=9gaVjab=nhaZbqabaaaleqaaOGaeiykaKcaaa@3FD0@</m:annotation></m:semantics></m:math> in the query sequence exceeds the predefined threshold <it>t</it><sub><it>j</it></sub>, without taking the learning set into account. <it>T</it><sub><it>j </it></sub>is then either 0 or -1.</p>
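            <p>A fixed penalty of this kind might be sketched in Python as follows. The helper that averages a property over the selected motif positions uses a plain arithmetic mean, which is an assumption made here for illustration; both function names and the example values are hypothetical.</p>
            <p><![CDATA[
```python
from typing import Sequence


def property_mean(values: Sequence[float]) -> float:
    """Arithmetic mean of a property over the chosen motif positions (assumed form)."""
    if not values:
        raise ValueError("at least one motif position is required")
    return sum(values) / len(values)


def fixed_penalty(query_mean: float, threshold: float) -> float:
    """Fixed penalty: T_j is -1 if the query mean exceeds t_j, otherwise 0."""
    # The learning set is deliberately not consulted for this penalty type.
    return -1.0 if query_mean > threshold else 0.0
```
]]></p>
            <p>For example, illustrative property values (1.0, 0.5, 0.0) have a mean of 0.5; with a threshold of 0.4 the sketch returns -1, and with a threshold of 0.5 it returns 0 because the mean does not exceed the threshold.</p>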
            <p>Whereas fixed <it>T</it><sub><it>j </it></sub>only penalize the mere occurrence of potentially hindering amino acids, Gaussian-type penalties also take into account the degree of deviation from property preferences at motif positions. To exclude sequences that strongly deviate from the derived consensus, the magnitude of <it>T</it><sub><it>j </it></sub>increases with the square of the difference between <m:math name="1745-6150-2-1-i12" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msub><m:mover accent="true"><m:mi>v</m:mi><m:mo>&#175;</m:mo></m:mover><m:mi>j</m:mi></m:msub><m:mo stretchy="false">(</m:mo><m:msub><m:mi>l</m:mi><m:mn>1</m:mn></m:msub><m:mo>,</m:mo><m:mn>...</m:mn><m:mo>,</m:mo><m:msub><m:mi>l</m:mi><m:mrow><m:msub><m:mi>n</m:mi><m:mrow><m:mi>p</m:mi><m:mi>o</m:mi><m:mi>s</m:mi></m:mrow></m:msub></m:mrow></m:msub><m:mo stretchy="false">)</m:mo></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacuWF2bGDgaqeamaaBaaaleaacqWFQbGAaeqaaOGaeiikaGIae8hBaW2aaSbaaSqaaiabigdaXaqabaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqWFSbaBdaWgaaWcbaGae8NBa42aaSbaaWqaaiab=bhaWjab=9gaVjab=nhaZbqabaaaleqaaOGaeiykaKcaaa@3FD0@</m:annotation></m:semantics></m:math> and the learning set mean value:</p>
            <p>
               <m:math name="1745-6150-2-1-i15" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:msub>
                           <m:mi>T</m:mi>
                           <m:mi>j</m:mi>
                        </m:msub>
                        <m:mo>=</m:mo>
                        <m:mo>&#8722;</m:mo>
                        <m:mi>&#934;</m:mi>
                        <m:mfrac>
                           <m:mrow>
                              <m:msup>
                                 <m:mrow>
                                    <m:mrow>
                                       <m:mo>(</m:mo>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mover accent="true">
                                                <m:mi>v</m:mi>
                                                <m:mo>&#175;</m:mo>
                                             </m:mover>
                                             <m:mi>j</m:mi>
                                          </m:msub>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:mn>...</m:mn>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>n</m:mi>
                                                         <m:mrow>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>o</m:mi>
                                                            <m:mi>s</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                          <m:mo>&#8722;</m:mo>
                                          <m:msub>
                                             <m:mover accent="true">
                                                <m:mi>v</m:mi>
                                                <m:mo>&#175;</m:mo>
                                             </m:mover>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>j</m:mi>
                                                   <m:mrow>
                                                      <m:mi>L</m:mi>
                                                      <m:mi>S</m:mi>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                          </m:msub>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:mn>...</m:mn>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>n</m:mi>
                                                         <m:mrow>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>o</m:mi>
                                                            <m:mi>s</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                       </m:mrow>
                                       <m:mo>)</m:mo>
                                    </m:mrow>
                                 </m:mrow>
                                 <m:mn>2</m:mn>
                              </m:msup>
                           </m:mrow>
                           <m:mrow>
                              <m:mn>2</m:mn>
                              <m:msub>
                                 <m:mi>&#963;</m:mi>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>j</m:mi>
                                       <m:mrow>
                                          <m:mi>L</m:mi>
                                          <m:mi>S</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                 </m:mrow>
                              </m:msub>
                              <m:msup>
                                 <m:mrow>
                                    <m:mrow>
                                       <m:mo>(</m:mo>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mi>l</m:mi>
                                             <m:mn>1</m:mn>
                                          </m:msub>
                                          <m:mo>,</m:mo>
                                          <m:mn>...</m:mn>
                                          <m:mo>,</m:mo>
                                          <m:msub>
                                             <m:mi>l</m:mi>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>n</m:mi>
                                                   <m:mrow>
                                                      <m:mi>p</m:mi>
                                                      <m:mi>o</m:mi>
                                                      <m:mi>s</m:mi>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                          </m:msub>
                                       </m:mrow>
                                       <m:mo>)</m:mo>
                                    </m:mrow>
                                 </m:mrow>
                                 <m:mn>2</m:mn>
                              </m:msup>
                           </m:mrow>
                        </m:mfrac>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mrow>
                              <m:mn>11</m:mn>
                           </m:mrow>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacqWFubavdaWgaaWcbaGae8NAaOgabeaakiabg2da9iabgkHiTiabfA6agnaalaaabaWaaeWaaeaacuWF2bGDgaqeamaaBaaaleaacqWFQbGAaeqaaOWaaeWaaeaacqWFSbaBdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=XgaSnaaBaaaleaacqWFUbGBdaWgaaadbaGae8hCaaNae83Ba8Mae83CamhabeaaaSqabaaakiaawIcacaGLPaaacqGHsislcuWF2bGDgaqeamaaBaaaleaacqWFQbGAdaWgaaadbaGae8htaWKae83uamfabeaaaSqabaGcdaqadaqaaiab=XgaSnaaBaaaleaacqaIXaqmaeqaaOGaeiilaWIaeiOla4IaeiOla4IaeiOla4IaeiilaWIae8hBaW2aaSbaaSqaaiab=5gaUnaaBaaameaacqWFWbaCcqWFVbWBcqWFZbWCaeqaaaWcbeaaaOGaayjkaiaawMcaaaGaayjkaiaawMcaamaaCaaaleqabaGaeGOmaidaaaGcbaGaeGOmaidcciGae43Wdm3aaSbaaSqaaiab=PgaQnaaBaaameaacqWFmbatcqWFtbWuaeqaaaWcbeaakmaabmaabaGae8hBaW2aaSbaaSqaaiabigdaXaqabaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqWFSbaBdaWgaaWcbaGae8NBa42aaSbaaWqaaiab=bhaWjab=9gaVjab=nhaZbqabaaaleqaaaGccaGLOaGaayzkaaWaaWbaaSqabeaacqaIYaGmaaaaaOGaaCzcaiaaxMaadaqadaqaaiabigdaXiabigdaXaGaayjkaiaawMcaaaaa@7B65@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
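            <p>Equation 11 can be evaluated with a few lines of Python. In this sketch the criterion &#934; is passed in as a ready-made value and treated as a 0/1 switch, which is an assumption based on its role of deciding whether the penalty applies; the function name, argument names and example numbers are likewise hypothetical.</p>
            <p><![CDATA[
```python
def gaussian_penalty(query_mean: float, ls_mean: float, ls_sigma: float, phi: int) -> float:
    """Equation 11: T_j = -phi * (v_query - v_LS)^2 / (2 * sigma_LS^2)."""
    if ls_sigma <= 0:
        raise ValueError("learning set dispersion must be positive")
    if phi not in (0, 1):
        raise ValueError("phi is treated as a 0/1 switch in this sketch")
    # Squared deviation of the query mean from the learning set mean,
    # scaled by twice the learning set variance.
    deviation = query_mean - ls_mean
    return -phi * deviation ** 2 / (2.0 * ls_sigma ** 2)
```
]]></p>
            <p>With an illustrative query mean of 2.0, a learning set mean of 1.0 and a dispersion of 0.5, the sketch yields a penalty of -2.0 when the switch is 1 and no penalty when it is 0.</p>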
            <p>The criterion &#934;<sub><it>j </it></sub>in equation 11 determines whether a penalty is applied. Depending on whether small or large property values should be penalized, &#934;<sub><it>j </it></sub>is given by one of the two equations below. Here, the freely selectable parameter <b><it>t</it></b><sub><it>j </it></sub>&#8805; 0 adjusts the stringency of the threshold.</p>
            <p>
               <m:math name="1745-6150-2-1-i16" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:msub>
                           <m:mi>&#934;</m:mi>
                           <m:mi>j</m:mi>
                        </m:msub>
                        <m:mo>=</m:mo>
                        <m:mrow>
                           <m:mo>{</m:mo>
                           <m:mrow>
                              <m:mtable>
                                 <m:mtr>
                                    <m:mtd>
                                       <m:mn>1</m:mn>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:mi>i</m:mi>
                                          <m:mi>f</m:mi>
                                       </m:mrow>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mover accent="true">
                                                <m:mi>v</m:mi>
                                                <m:mo>&#175;</m:mo>
                                             </m:mover>
                                             <m:mi>j</m:mi>
                                          </m:msub>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:mn>...</m:mn>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>n</m:mi>
                                                         <m:mrow>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>o</m:mi>
                                                            <m:mi>s</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                          <m:mo>&lt;</m:mo>
                                          <m:msub>
                                             <m:mover accent="true">
                                                <m:mi>v</m:mi>
                                                <m:mo>&#175;</m:mo>
                                             </m:mover>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>j</m:mi>
                                                   <m:mrow>
                                                      <m:mi>L</m:mi>
                                                      <m:mi>S</m:mi>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                          </m:msub>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:mn>...</m:mn>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>n</m:mi>
                                                         <m:mrow>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>o</m:mi>
                                                            <m:mi>s</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                          <m:mo>&#8722;</m:mo>
                                          <m:msub>
                                             <m:mi>t</m:mi>
                                             <m:mi>j</m:mi>
                                          </m:msub>
                                          <m:msub>
                                             <m:mi>&#963;</m:mi>
                                             <m:mi>j</m:mi>
                                          </m:msub>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:mn>...</m:mn>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>n</m:mi>
                                                         <m:mrow>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>o</m:mi>
                                                            <m:mi>s</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                                 <m:mtr>
                                    <m:mtd>
                                       <m:mn>0</m:mn>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:mi>i</m:mi>
                                          <m:mi>f</m:mi>
                                       </m:mrow>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mover accent="true">
                                                <m:mi>v</m:mi>
                                                <m:mo>&#175;</m:mo>
                                             </m:mover>
                                             <m:mi>j</m:mi>
                                          </m:msub>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:mn>...</m:mn>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>n</m:mi>
                                                         <m:mrow>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>o</m:mi>
                                                            <m:mi>s</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                          <m:mo>&#8805;</m:mo>
                                          <m:msub>
                                             <m:mover accent="true">
                                                <m:mi>v</m:mi>
                                                <m:mo>&#175;</m:mo>
                                             </m:mover>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>j</m:mi>
                                                   <m:mrow>
                                                      <m:mi>L</m:mi>
                                                      <m:mi>S</m:mi>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                          </m:msub>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:mn>...</m:mn>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>n</m:mi>
                                                         <m:mrow>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>o</m:mi>
                                                            <m:mi>s</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                          <m:mo>&#8722;</m:mo>
                                          <m:msub>
                                             <m:mi>t</m:mi>
                                             <m:mi>j</m:mi>
                                          </m:msub>
                                          <m:msub>
                                             <m:mi>&#963;</m:mi>
                                             <m:mi>j</m:mi>
                                          </m:msub>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:mn>...</m:mn>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>n</m:mi>
                                                         <m:mrow>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>o</m:mi>
                                                            <m:mi>s</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                              </m:mtable>
                              <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                              <m:mrow>
                                 <m:mo>(</m:mo>
                                 <m:mrow>
                                    <m:mn>12</m:mn>
                                 </m:mrow>
                                 <m:mo>)</m:mo>
                              </m:mrow>
                           </m:mrow>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHMoGrdaWgaaWcbaacbmGae8NAaOgabeaakiabg2da9maaceqabaqbaeqabiWaaaqaaiabigdaXaqaaiab=LgaPjab=zgaMbqaaiqb=zha2zaaraWaaSbaaSqaaiab=PgaQbqabaGcdaqadaqaaiab=XgaSnaaBaaaleaacqaIXaqmaeqaaOGaeiilaWIaeiOla4IaeiOla4IaeiOla4IaeiilaWIae8hBaW2aaSbaaSqaaiab=5gaUnaaBaaameaacqWFWbaCcqWFVbWBcqWFZbWCaeqaaaWcbeaaaOGaayjkaiaawMcaaiabgYda8iqb=zha2zaaraWaaSbaaSqaaiab=PgaQnaaBaaameaacqWFmbatcqWFtbWuaeqaaaWcbeaakmaabmaabaGae8hBaW2aaSbaaSqaaiabigdaXaqabaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqWFSbaBdaWgaaWcbaGae8NBa42aaSbaaWqaaiab=bhaWjab=9gaVjab=nhaZbqabaaaleqaaaGccaGLOaGaayzkaaGaeyOeI0Iae8hDaq3aaSbaaSqaaiab=PgaQbqabaacciGccqGFdpWCdaWgaaWcbaGae8NAaOgabeaakmaabmaabaGae8hBaW2aaSbaaSqaaiabigdaXaqabaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqWFSbaBdaWgaaWcbaGae8NBa42aaSbaaWqaaiab=bhaWjab=9gaVjab=nhaZbqabaaaleqaaaGccaGLOaGaayzkaaaabaGaeGimaadabaGae8xAaKMae8NzaygabaGaf8NDayNbaebadaWgaaWcbaGae8NAaOgabeaakmaabmaabaGae8hBaW2aaSbaaSqaaiabigdaXaqabaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqWFSbaBdaWgaaWcbaGae8NBa42aaSbaaWqaaiab=bhaWjab=9gaVjab=nhaZbqabaaaleqaaaGccaGLOaGaayzkaaGaeyyzImRaf8NDayNbaebadaWgaaWcbaGae8NAaO2aaSbaaWqaaiab=Xeamjab=nfatbqabaaaleqaaOWaaeWaaeaacqWFSbaBdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=XgaSnaaBaaaleaacqWFUbGBdaWgaaadbaGae8hCaaNae83Ba8Mae83CamhabeaaaSqabaaakiaawIcacaGLPaaacqGHsislcqWF0baDdaWgaaWcbaGae8NAaOgabeaakiab+n8aZnaaBaaaleaacqWFQbGAaeqaaOWaaeWaaeaacqWFSbaBdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=XgaSnaaBaaaleaacqWFUbGBdaWgaaadbaGae8hCaaNae83Ba8Mae83CamhabeaaaSqabaaakiaawIcacaGLPaaaaaGaaCzcaiaaxMaadaqadaqaaiabigdaXiabikdaYaGaayjkaiaawMcaaaGaay5Eaaaaaa@BF7C@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>
               <m:math name="1745-6150-2-1-i17" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:msub>
                           <m:mi>&#934;</m:mi>
                           <m:mi>j</m:mi>
                        </m:msub>
                        <m:mo>=</m:mo>
                        <m:mrow>
                           <m:mo>{</m:mo>
                           <m:mrow>
                              <m:mtable>
                                 <m:mtr>
                                    <m:mtd>
                                       <m:mn>1</m:mn>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:mi>i</m:mi>
                                          <m:mi>f</m:mi>
                                       </m:mrow>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mover accent="true">
                                                <m:mi>v</m:mi>
                                                <m:mo>&#175;</m:mo>
                                             </m:mover>
                                             <m:mi>j</m:mi>
                                          </m:msub>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:mn>...</m:mn>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>n</m:mi>
                                                         <m:mrow>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>o</m:mi>
                                                            <m:mi>s</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                          <m:mo>&gt;</m:mo>
                                          <m:msub>
                                             <m:mover accent="true">
                                                <m:mi>v</m:mi>
                                                <m:mo>&#175;</m:mo>
                                             </m:mover>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>j</m:mi>
                                                   <m:mrow>
                                                      <m:mi>L</m:mi>
                                                      <m:mi>S</m:mi>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                          </m:msub>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:mn>...</m:mn>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>n</m:mi>
                                                         <m:mrow>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>o</m:mi>
                                                            <m:mi>s</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                          <m:mo>+</m:mo>
                                          <m:msub>
                                             <m:mi>t</m:mi>
                                             <m:mi>j</m:mi>
                                          </m:msub>
                                          <m:msub>
                                             <m:mi>&#963;</m:mi>
                                             <m:mi>j</m:mi>
                                          </m:msub>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:mn>...</m:mn>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>n</m:mi>
                                                         <m:mrow>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>o</m:mi>
                                                            <m:mi>s</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                                 <m:mtr>
                                    <m:mtd>
                                       <m:mn>0</m:mn>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:mi>i</m:mi>
                                          <m:mi>f</m:mi>
                                       </m:mrow>
                                    </m:mtd>
                                    <m:mtd>
                                       <m:mrow>
                                          <m:msub>
                                             <m:mover accent="true">
                                                <m:mi>v</m:mi>
                                                <m:mo>&#175;</m:mo>
                                             </m:mover>
                                             <m:mi>j</m:mi>
                                          </m:msub>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:mn>...</m:mn>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>n</m:mi>
                                                         <m:mrow>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>o</m:mi>
                                                            <m:mi>s</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                          <m:mo>&#8804;</m:mo>
                                          <m:msub>
                                             <m:mover accent="true">
                                                <m:mi>v</m:mi>
                                                <m:mo>&#175;</m:mo>
                                             </m:mover>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>j</m:mi>
                                                   <m:mrow>
                                                      <m:mi>L</m:mi>
                                                      <m:mi>S</m:mi>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                          </m:msub>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:mn>...</m:mn>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>n</m:mi>
                                                         <m:mrow>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>o</m:mi>
                                                            <m:mi>s</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                          <m:mo>+</m:mo>
                                          <m:msub>
                                             <m:mi>t</m:mi>
                                             <m:mi>j</m:mi>
                                          </m:msub>
                                          <m:msub>
                                             <m:mi>&#963;</m:mi>
                                             <m:mi>j</m:mi>
                                          </m:msub>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mn>1</m:mn>
                                                </m:msub>
                                                <m:mo>,</m:mo>
                                                <m:mn>...</m:mn>
                                                <m:mo>,</m:mo>
                                                <m:msub>
                                                   <m:mi>l</m:mi>
                                                   <m:mrow>
                                                      <m:msub>
                                                         <m:mi>n</m:mi>
                                                         <m:mrow>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>o</m:mi>
                                                            <m:mi>s</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                       </m:mrow>
                                    </m:mtd>
                                 </m:mtr>
                              </m:mtable>
                              <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                              <m:mrow>
                                 <m:mo>(</m:mo>
                                 <m:mrow>
                                    <m:mn>13</m:mn>
                                 </m:mrow>
                                 <m:mo>)</m:mo>
                              </m:mrow>
                           </m:mrow>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqHMoGrdaWgaaWcbaacbmGae8NAaOgabeaakiabg2da9maaceqabaqbaeqabiWaaaqaaiabigdaXaqaaiab=LgaPjab=zgaMbqaaiqb=zha2zaaraWaaSbaaSqaaiab=PgaQbqabaGcdaqadaqaaiab=XgaSnaaBaaaleaacqaIXaqmaeqaaOGaeiilaWIaeiOla4IaeiOla4IaeiOla4IaeiilaWIae8hBaW2aaSbaaSqaaiab=5gaUnaaBaaameaacqWFWbaCcqWFVbWBcqWFZbWCaeqaaaWcbeaaaOGaayjkaiaawMcaaiab=5da+iqb=zha2zaaraWaaSbaaSqaaiab=PgaQnaaBaaameaacqWFmbatcqWFtbWuaeqaaaWcbeaakmaabmaabaGae8hBaW2aaSbaaSqaaiabigdaXaqabaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqWFSbaBdaWgaaWcbaGae8NBa42aaSbaaWqaaiab=bhaWjab=9gaVjab=nhaZbqabaaaleqaaaGccaGLOaGaayzkaaGaey4kaSIae8hDaq3aaSbaaSqaaiab=PgaQbqabaacciGccqGFdpWCdaWgaaWcbaGae8NAaOgabeaakmaabmaabaGae8hBaW2aaSbaaSqaaiabigdaXaqabaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqWFSbaBdaWgaaWcbaGae8NBa42aaSbaaWqaaiab=bhaWjab=9gaVjab=nhaZbqabaaaleqaaaGccaGLOaGaayzkaaaabaGaeGimaadabaGae8xAaKMae8NzaygabaGaf8NDayNbaebadaWgaaWcbaGae8NAaOgabeaakmaabmaabaGae8hBaW2aaSbaaSqaaiabigdaXaqabaGccqGGSaalcqGGUaGlcqGGUaGlcqGGUaGlcqGGSaalcqWFSbaBdaWgaaWcbaGae8NBa42aaSbaaWqaaiab=bhaWjab=9gaVjab=nhaZbqabaaaleqaaaGccaGLOaGaayzkaaGaeyizImQaf8NDayNbaebadaWgaaWcbaGae8NAaO2aaSbaaWqaaiab=Xeamjab=nfatbqabaaaleqaaOWaaeWaaeaacqWFSbaBdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=XgaSnaaBaaaleaacqWFUbGBdaWgaaadbaGae8hCaaNae83Ba8Mae83CamhabeaaaSqabaaakiaawIcacaGLPaaacqGHRaWkcqWF0baDdaWgaaWcbaGae8NAaOgabeaakiab+n8aZnaaBaaaleaacqWFQbGAaeqaaOWaaeWaaeaacqWFSbaBdaWgaaWcbaGaeGymaedabeaakiabcYcaSiabc6caUiabc6caUiabc6caUiabcYcaSiab=XgaSnaaBaaaleaacqWFUbGBdaWgaaadbaGae8hCaaNae83Ba8Mae83CamhabeaaaSqabaaakiaawIcacaGLPaaaaaGaaCzcaiaaxMaadaqadaqaaiabigdaXiabiodaZaGaayjkaiaawMcaaaGaay5Eaaaaaa@BF54@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>In our concept, the value <b><it>t</it></b><sub><it>j </it></sub>is not an adjustable parameter that depends on the learning set. Generally, <b><it>t</it></b><sub><it>j </it></sub>is set to zero for all Gaussian-type terms; it needs to be specified only for the fixed penalties, where it defines the level of the threshold (see Table <tblr tid="T6">6</tblr>). We introduce <b><it>t</it></b><sub><it>j </it></sub>so that Gaussian terms and fixed penalties share the same formal notation.</p>
            <tbl id="T6">
               <title>
                  <p>Table 6</p>
               </title>
               <caption>
                  <p>Summary of the physical terms <it>T</it><sub><it>j </it></sub>in the scoring function of the pkaPS predictor.</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>
                              <it>j</it>
                           </sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>Property</p>
                     </c>
                     <c ca="center">
                        <p>Positions</p>
                     </c>
                     <c ca="center">
                        <p>&#945;<sub>ppt,j</sub></p>
                     </c>
                     <c ca="center">
                        <p>Description</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>1</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>(+) H, K, R</p>
                     </c>
                     <c ca="center">
                        <p>-3/-2</p>
                     </c>
                     <c ca="center">
                        <p>1.0</p>
                     </c>
                     <c ca="center">
                        <p>Positive charge</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>2</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>EISD860102 [29]</p>
                     </c>
                     <c ca="center">
                        <p>-3/-2</p>
                     </c>
                     <c ca="center">
                        <p>0.030</p>
                     </c>
                     <c ca="center">
                        <p>Hydrophilic residues</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>3</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>ZIMJ680104 [87]</p>
                     </c>
                     <c ca="center">
                        <p>-6 to -2</p>
                     </c>
                     <c ca="center">
                        <p>0.020</p>
                     </c>
                     <c ca="center">
                        <p>Isoelectric point (positive charge), long range</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>4</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>(+) H, K, R; (-) D, E</p>
                     </c>
                     <c ca="center">
                        <p>-6 to -2</p>
                     </c>
                     <c ca="center">
                        <p>0.48</p>
                     </c>
                     <c ca="center">
                        <p>Total charge, long range</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>5</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>GEIM800106 [34]</p>
                     </c>
                     <c ca="center">
                        <p>+1</p>
                     </c>
                     <c ca="center">
                        <p>0.070</p>
                     </c>
                     <c ca="center">
                        <p>&#946;-strand preference</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>6</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>GEIM800107 [34]</p>
                     </c>
                     <c ca="center">
                        <p>+1/+4</p>
                     </c>
                     <c ca="center">
                        <p>0.040</p>
                     </c>
                     <c ca="center">
                        <p>&#946;-strand preference, compensated</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>7</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>HAGECH94_V [88]</p>
                     </c>
                     <c ca="center">
                        <p>+2/+3</p>
                     </c>
                     <c ca="center">
                        <p>0.040</p>
                     </c>
                     <c ca="center">
                        <p>Size restrictions</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>8</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>KARP850101 [89]</p>
                     </c>
                     <c ca="center">
                        <p>+3</p>
                     </c>
                     <c ca="center">
                        <p>0.040</p>
                     </c>
                     <c ca="center">
                        <p>Flexibility</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>9</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>KARP850101 [89]</p>
                     </c>
                     <c ca="center">
                        <p>-9 to -4</p>
                     </c>
                     <c ca="center">
                        <p>0.040</p>
                     </c>
                     <c ca="center">
                        <p>Minimal linker &#8211; flexibility</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>10</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>EISD840101 [29]</p>
                     </c>
                     <c ca="center">
                        <p>-9 to -4</p>
                     </c>
                     <c ca="center">
                        <p>0.040</p>
                     </c>
                     <c ca="center">
                        <p>Minimal linker &#8211; hydrophilicity</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>11</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>EISD840101 [29]</p>
                     </c>
                     <c ca="center">
                        <p>+4 to +9</p>
                     </c>
                     <c ca="center">
                        <p>0.058</p>
                     </c>
                     <c ca="center">
                        <p>Minimal linker &#8211; hydrophilicity</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>12</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>KARP850101 [89]</p>
                     </c>
                     <c ca="center">
                        <p>+4 to +9</p>
                     </c>
                     <c ca="center">
                        <p>0.058</p>
                     </c>
                     <c ca="center">
                        <p>Minimal linker &#8211; flexibility</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>13</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>CIDH920105 [90]</p>
                     </c>
                     <c ca="center">
                        <p>-18 to -6, +6 to +23</p>
                     </c>
                     <c ca="center">
                        <p>0.040</p>
                     </c>
                     <c ca="center">
                        <p>Avoid buried regions &#8211; hydrophilicity</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>T</it>
                           <sub>14</sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>VINM940101 [30]</p>
                     </c>
                     <c ca="center">
                        <p>-18 to -6, +6 to +23</p>
                     </c>
                     <c ca="center">
                        <p>0.040</p>
                     </c>
                     <c ca="center">
                        <p>Avoid buried regions &#8211; flexibility</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>The table shows the complete list of physical property terms in the score function. The values <b><it>t</it></b><sub><it>j </it></sub>in equations 12 and 13 are set to zero for Gaussian-type terms and to 0.1 for the fixed penalties <b><it>T</it></b><sub>1 </sub>and <b><it>T</it></b><sub>4</sub>. The only adjustable parameter per term is the weight &#945;<sub>ppt,j </sub>(equation 10). These parameters were selected so that <b><it>S</it></b><sub>ppt </sub>is close to zero for most of the learning set examples. The values <it>&#945;</it><sub><it>profile,j </it></sub>(equation 9) for the positions -6...+6 are the following multiples of a normalization factor 0.051: 7, 6, 3, 5, 6, 6, 6, 3, 2, 5, 3, 4, and 1. Initial guesses for the <it>&#945;</it><sub>ppt,j </sub>and <it>&#945;</it><sub>profile,j </sub>parameters were calculated with linear-kernel support vector machines as implemented in the LIBSVM library [91]. These weights were subsequently rounded to two significant digits and edited manually to avoid non-positive numbers and to achieve close-to-zero <b><it>T</it></b><sub><it>j </it></sub>values for the learning set sequences.</p>
               </tblfn>
            </tbl>
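            <p>As a small illustration of the Table 6 footnote, the profile weights can be obtained by multiplying the listed integers by the normalization factor 0.051. The sketch below assumes that the thirteen multiples are given in order for the positions -6 to +6; the function name profile_weights and its arguments are hypothetical and are not part of the pkaPS implementation.</p>
            <p>
```python
# Multiples listed in the Table 6 footnote, assumed to be ordered
# for the motif positions -6, -5, ..., +6.
MULTIPLES = [7, 6, 3, 5, 6, 6, 6, 3, 2, 5, 3, 4, 1]
NORMALIZATION = 0.051  # normalization factor from the footnote


def profile_weights(multiples=MULTIPLES, factor=NORMALIZATION, first_position=-6):
    """Return {position: alpha_profile} for consecutive motif positions.

    Hypothetical helper for illustration only.
    """
    return {first_position + i: m * factor for i, m in enumerate(multiples)}


weights = profile_weights()
for position, alpha in weights.items():
    print(f"{position:+d}: {alpha:.3f}")
```
            </p>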
         </sec>
         <sec>
            <st>
               <p>Evaluation of predictor performance</p>
            </st>
            <p>The prediction outcomes of an algorithm can be grouped into the following four categories: "true-positives (<it>T</it><sub><it>P</it></sub>)" are correctly predicted queries that contain the analyzed feature; "false-negatives (<it>F</it><sub><it>N</it></sub>)" contain the feature but are predicted not to do so; "true-negatives (<it>T</it><sub><it>N</it></sub>)" are correctly predicted not to contain the feature; "false-positives (<it>F</it><sub><it>P</it></sub>)" do not contain the feature but are wrongly predicted to do so. The numbers of prediction results that fall into these categories are used to calculate measures of predictor performance, typically sensitivity (<it>S</it><sub><it>n</it></sub>) and specificity (<it>S</it><sub><it>p</it></sub>). The former is defined as the proportion of positive sites that the method identifies, and the latter as the fraction of negative sites that are correctly classified <abbrgrp><abbr bid="B12">12</abbr><abbr bid="B78">78</abbr></abbrgrp>.</p>
            <p>
               <m:math name="1745-6150-2-1-i18" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:msub>
                           <m:mi>S</m:mi>
                           <m:mi>n</m:mi>
                        </m:msub>
                        <m:mo>=</m:mo>
                        <m:mfrac>
                           <m:mrow>
                              <m:msub>
                                 <m:mi>T</m:mi>
                                 <m:mi>P</m:mi>
                              </m:msub>
                           </m:mrow>
                           <m:mrow>
                              <m:msub>
                                 <m:mi>T</m:mi>
                                 <m:mi>P</m:mi>
                              </m:msub>
                              <m:mo>+</m:mo>
                              <m:msub>
                                 <m:mi>F</m:mi>
                                 <m:mi>N</m:mi>
                              </m:msub>
                           </m:mrow>
                        </m:mfrac>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mrow>
                              <m:mn>14</m:mn>
                           </m:mrow>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacqWFtbWudaWgaaWcbaGae8NBa4gabeaakiabg2da9maalaaabaGae8hvaq1aaSbaaSqaaiab=bfaqbqabaaakeaacqWFubavdaWgaaWcbaGae8huaafabeaakiabgUcaRiab=zeagnaaBaaaleaacqWFobGtaeqaaaaakiaaxMaacaWLjaWaaeWaaeaacqaIXaqmcqaI0aanaiaawIcacaGLPaaaaaa@3D9D@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>
               <m:math name="1745-6150-2-1-i19" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:msub>
                           <m:mi>S</m:mi>
                           <m:mi>p</m:mi>
                        </m:msub>
                        <m:mo>=</m:mo>
                        <m:mfrac>
                           <m:mrow>
                              <m:msub>
                                 <m:mi>T</m:mi>
                                 <m:mi>N</m:mi>
                              </m:msub>
                           </m:mrow>
                           <m:mrow>
                              <m:msub>
                                 <m:mi>F</m:mi>
                                 <m:mi>P</m:mi>
                              </m:msub>
                              <m:mo>+</m:mo>
                              <m:msub>
                                 <m:mi>T</m:mi>
                                 <m:mi>N</m:mi>
                              </m:msub>
                           </m:mrow>
                        </m:mfrac>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mrow>
                              <m:mn>15</m:mn>
                           </m:mrow>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacqWFtbWudaWgaaWcbaGae8hCaahabeaakiabg2da9maalaaabaGae8hvaq1aaSbaaSqaaiab=5eaobqabaaakeaacqWFgbGrdaWgaaWcbaGae8huaafabeaakiabgUcaRiab=rfaunaaBaaaleaacqWFobGtaeqaaaaakiaaxMaacaWLjaWaaeWaaeaacqaIXaqmcqaI1aqnaiaawIcacaGLPaaaaaa@3D9F@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>Alternatively, one can use the "false-negative" (<it>F</it><sub><it>n</it></sub>) and "false-positive" (<it>F</it><sub><it>p</it></sub>) rates. They are the complements of sensitivity and specificity, namely the fractions of wrongly classified sequences in each prediction class, and are equal to 1 minus the respective <it>S</it><sub><it>n </it></sub>or <it>S</it><sub><it>p </it></sub>values.</p>
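            <p>The relations between the four outcome counts and the derived rates can be sketched as follows. This is a minimal illustration of equations 14 and 15 and of the complementary rates; the function name performance and the counts in the example call are hypothetical and do not correspond to results reported in this work.</p>
            <p>
```python
def performance(tp, fn, tn, fp):
    """Sensitivity, specificity and the complementary error rates."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sn = tp / (tp + fn)  # equation 14
    sp = tn / (fp + tn)  # equation 15
    # The false-negative and false-positive rates are 1 minus Sn and Sp.
    return {"Sn": sn, "Sp": sp, "Fn": 1.0 - sn, "Fp": 1.0 - sp}


# Hypothetical counts, for illustration only.
rates = performance(tp=90, fn=10, tn=950, fp=50)
print(rates)
```
            </p>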
         </sec>
         <sec>
            <st>
               <p>"On the fly" estimation of false-positive rates</p>
            </st>
            <p>To assess false-positive rates "on the fly" for obtained total scores <it>S</it>, a previously described estimation methodology <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B20">20</abbr><abbr bid="B79">79</abbr><abbr bid="B80">80</abbr></abbrgrp> is used that follows the spirit of BLAST <it>p</it>-values <abbrgrp><abbr bid="B81">81</abbr></abbrgrp>. This makes the total score <it>S </it>easier to interpret and allows a better comparison with outputs from other prediction programs. The probability of a false-positive prediction is approximated by the empirical score distribution of sequences that are known not to carry the feature of interest. If a set of negative examples exists, it can be used directly for this task. If none is available, the function can be extrapolated from the distribution of low scores.</p>
            <p>The generalized analytical form of the extreme-value distribution that has been applied successfully in the MyPS <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> and big-&#928; predictors <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B79">79</abbr></abbrgrp> is used for this approximation. The probability <it>P </it>that a score <it>S </it>exceeds a threshold <it>S</it><sub><it>th </it></sub>is calculated from a polynomial <b><it>f </it></b>(<b><it>S</it></b><sub><b><it>th</it></b></sub>) of the score threshold <it>S</it><sub><it>th </it></sub>and is given by:</p>
            <p>
               <m:math name="1745-6150-2-1-i20" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>P</m:mi>
                        <m:mo stretchy="false">(</m:mo>
                        <m:mi>S</m:mi>
                        <m:mo>></m:mo>
                        <m:msub>
                           <m:mi>S</m:mi>
                           <m:mrow>
                              <m:mi>t</m:mi>
                              <m:mi>h</m:mi>
                           </m:mrow>
                        </m:msub>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>=</m:mo>
                        <m:mn>1</m:mn>
                        <m:mo>&#8722;</m:mo>
                        <m:msup>
                           <m:mi>e</m:mi>
                           <m:mrow>
                              <m:mo>&#8722;</m:mo>
                              <m:msup>
                                 <m:mi>e</m:mi>
                                 <m:mrow>
                                    <m:mo>&#8722;</m:mo>
                                    <m:mi>f</m:mi>
                                    <m:mo stretchy="false">(</m:mo>
                                    <m:msub>
                                       <m:mi>S</m:mi>
                                       <m:mrow>
                                          <m:mi>t</m:mi>
                                          <m:mi>h</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                    <m:mo stretchy="false">)</m:mo>
                                 </m:mrow>
                              </m:msup>
                           </m:mrow>
                        </m:msup>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mrow>
                              <m:mn>16</m:mn>
                           </m:mrow>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacqWFqbaucqGGOaakcqWFtbWucqGH+aGpcqWFtbWudaWgaaWcbaGae8hDaqNae8hAaGgabeaakiabcMcaPiabg2da9iabigdaXiabgkHiTiab=vgaLnaaCaaaleqabaGaeyOeI0Iae8xzau2aaWbaaWqabeaacqGHsislcqWFMbGzcqGGOaakcqWFtbWudaWgaaqaaiab=rha0jab=HgaObqabaGaeiykaKcaaaaakiaaxMaacaWLjaWaaeWaaeaacqaIXaqmcqaI2aGnaiaawIcacaGLPaaaaaa@496D@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>where</p>
            <p>
               <m:math name="1745-6150-2-1-i21" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:mi>f</m:mi>
                        <m:mo stretchy="false">(</m:mo>
                        <m:msub>
                           <m:mi>S</m:mi>
                           <m:mrow>
                              <m:mi>t</m:mi>
                              <m:mi>h</m:mi>
                           </m:mrow>
                        </m:msub>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>=</m:mo>
                        <m:mstyle displaystyle="true">
                           <m:munderover>
                              <m:mo>&#8721;</m:mo>
                              <m:mrow>
                                 <m:mi>i</m:mi>
                                 <m:mo>=</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                              <m:mi>n</m:mi>
                           </m:munderover>
                           <m:mrow>
                              <m:msub>
                                 <m:mi>&#955;</m:mi>
                                 <m:mi>i</m:mi>
                              </m:msub>
                           </m:mrow>
                        </m:mstyle>
                        <m:msup>
                           <m:mrow>
                              <m:mrow>
                                 <m:mo>(</m:mo>
                                 <m:mrow>
                                    <m:msub>
                                       <m:mi>S</m:mi>
                                       <m:mrow>
                                          <m:mi>t</m:mi>
                                          <m:mi>h</m:mi>
                                       </m:mrow>
                                    </m:msub>
                                    <m:mo>&#8722;</m:mo>
                                    <m:mi>u</m:mi>
                                 </m:mrow>
                                 <m:mo>)</m:mo>
                              </m:mrow>
                           </m:mrow>
                           <m:mi>i</m:mi>
                        </m:msup>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mrow>
                              <m:mn>17</m:mn>
                           </m:mrow>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacqWFMbGzcqGGOaakcqWFtbWudaWgaaWcbaGae8hDaqNae8hAaGgabeaakiabcMcaPiabg2da9maaqahabaacciGae43UdW2aaSbaaSqaaiab=LgaPbqabaaabaGae8xAaKMaeyypa0JaeGymaedabaGae8NBa4ganiabggHiLdGcdaqadaqaaiab=nfatnaaBaaaleaacqWF0baDcqWFObaAaeqaaOGaeyOeI0Iae8xDauhacaGLOaGaayzkaaWaaWbaaSqabeaacqWFPbqAaaGccaWLjaGaaCzcamaabmaabaGaeGymaeJaeG4naCdacaGLOaGaayzkaaaaaa@4D5F@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
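            <p>Equations 16 and 17 can be evaluated numerically as in the following sketch. The coefficients lambdas and the offset u used here are hypothetical placeholders chosen only to make the example run; they are not the fitted parameters of the predictor, and the function names are likewise assumptions.</p>
            <p>
```python
import math


def f_poly(s_th, lambdas, u):
    """Equation 17: sum over i = 1..n of lambda_i * (s_th - u)**i."""
    return sum(lam * (s_th - u) ** i for i, lam in enumerate(lambdas, start=1))


def false_positive_probability(s_th, lambdas, u):
    """Equation 16: P(S > s_th) = 1 - exp(-exp(-f(s_th)))."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate when x is tiny.
    return -math.expm1(-math.exp(-f_poly(s_th, lambdas, u)))


# Hypothetical parameters (not the fitted values of the predictor).
lambdas = [1.0]
u = 0.0
for s in (-2.0, 0.0, 2.0, 5.0):
    print(s, false_positive_probability(s, lambdas, u))
```
            </p>
            <p>With a polynomial that increases with the score threshold, as in this example, the estimated false-positive probability decreases monotonically as the threshold grows.</p>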
            <p>The quality of each fit is evaluated with the residual <it>R</it><sub><it>n </it></sub>of the least-squares fit over all sequences <it>k </it>included in the fit evaluation (1 &#8804; <it>k </it>&#8804; <it>n</it><sub><it>seq</it></sub>; <it>n</it><sub><it>seq </it></sub>is the number of sequences included in the fit evaluation, <it>S</it><sub><it>th</it>,<it>k </it></sub>is the total score for the <it>k</it><sup>th </sup>sequence):</p>
            <p>
               <m:math name="1745-6150-2-1-i22" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:msub>
                           <m:mi>R</m:mi>
                           <m:mi>n</m:mi>
                        </m:msub>
                        <m:mo>=</m:mo>
                        <m:mstyle displaystyle="true">
                           <m:munderover>
                              <m:mo>&#8721;</m:mo>
                              <m:mrow>
                                  <m:mi>k</m:mi>
                                 <m:mo>=</m:mo>
                                 <m:mn>1</m:mn>
                              </m:mrow>
                              <m:mrow>
                                 <m:msub>
                                    <m:mi>n</m:mi>
                                    <m:mrow>
                                       <m:mi>s</m:mi>
                                       <m:mi>e</m:mi>
                                       <m:mi>q</m:mi>
                                    </m:mrow>
                                 </m:msub>
                              </m:mrow>
                           </m:munderover>
                           <m:mrow>
                              <m:msup>
                                 <m:mrow>
                                    <m:mrow>
                                       <m:mo>&#9001;</m:mo>
                                       <m:mrow>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>ln</m:mi>
                                          <m:mo>&#8289;</m:mo>
                                          <m:mrow>
                                             <m:mo>{</m:mo>
                                             <m:mrow>
                                                <m:mo>&#8722;</m:mo>
                                                <m:mi>ln</m:mi>
                                                <m:mo>&#8289;</m:mo>
                                                <m:mrow>
                                                   <m:mo>[</m:mo>
                                                   <m:mrow>
                                                      <m:mn>1</m:mn>
                                                      <m:mo>&#8722;</m:mo>
                                                      <m:msub>
                                                         <m:mi>P</m:mi>
                                                         <m:mrow>
                                                            <m:mi>e</m:mi>
                                                            <m:mi>m</m:mi>
                                                            <m:mi>p</m:mi>
                                                            <m:mi>i</m:mi>
                                                            <m:mi>r</m:mi>
                                                            <m:mi>i</m:mi>
                                                            <m:mi>c</m:mi>
                                                            <m:mi>a</m:mi>
                                                            <m:mi>l</m:mi>
                                                         </m:mrow>
                                                      </m:msub>
                                                      <m:mrow>
                                                         <m:mo>(</m:mo>
                                                         <m:mrow>
                                                            <m:mi>S</m:mi>
                                                            <m:mo>&gt;</m:mo>
                                                            <m:msub>
                                                               <m:mi>S</m:mi>
                                                               <m:mrow>
                                                                  <m:mi>t</m:mi>
                                                                  <m:mi>h</m:mi>
                                                                  <m:mo>,</m:mo>
                                                                  <m:mi>k</m:mi>
                                                               </m:mrow>
                                                            </m:msub>
                                                         </m:mrow>
                                                         <m:mo>)</m:mo>
                                                      </m:mrow>
                                                   </m:mrow>
                                                   <m:mo>]</m:mo>
                                                </m:mrow>
                                             </m:mrow>
                                             <m:mo>}</m:mo>
                                          </m:mrow>
                                          <m:mo>&#8722;</m:mo>
                                          <m:mi>f</m:mi>
                                          <m:mrow>
                                             <m:mo>(</m:mo>
                                             <m:mrow>
                                                <m:msub>
                                                   <m:mi>S</m:mi>
                                                   <m:mrow>
                                                      <m:mi>t</m:mi>
                                                      <m:mi>h</m:mi>
                                                      <m:mo>,</m:mo>
                                                      <m:mi>k</m:mi>
                                                   </m:mrow>
                                                </m:msub>
                                             </m:mrow>
                                             <m:mo>)</m:mo>
                                          </m:mrow>
                                       </m:mrow>
                                       <m:mo>&#9002;</m:mo>
                                    </m:mrow>
                                 </m:mrow>
                                 <m:mn>2</m:mn>
                              </m:msup>
                              <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                              <m:mrow>
                                 <m:mo>(</m:mo>
                                 <m:mrow>
                                    <m:mn>18</m:mn>
                                 </m:mrow>
                                 <m:mo>)</m:mo>
                              </m:mrow>
                           </m:mrow>
                        </m:mstyle>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaaieWacqWFsbGudaWgaaWcbaGae8NBa4gabeaakiabg2da9maaqahabaWaaaWabeaacqGHsislcyGGSbaBcqGGUbGBdaGadeqaaiabgkHiTiGbcYgaSjabc6gaUnaadmaabaGaeGymaeJaeyOeI0Iae8huaa1aaSbaaSqaaiab=vgaLjab=1gaTjab=bhaWjab=LgaPjab=jhaYjab=LgaPjab=ngaJjab=fgaHjab=XgaSbqabaGcdaqadaqaaiab=nfatjabgYda8iab=nfatnaaBaaaleaacqWF0baDcqWFObaAcqGGSaalcqWFRbWAaeqaaaGccaGLOaGaayzkaaaacaGLBbGaayzxaaaacaGL7bGaayzFaaGaeyOeI0Iae8Nzay2aaeWaaeaacqWFtbWudaWgaaWcbaGae8hDaqNae8hAaGMaeiilaWIae83AaSgabeaaaOGaayjkaiaawMcaaaGaayzkJiaawQYiamaaCaaaleqabaGaeGOmaidaaOGaaCzcaiaaxMaadaqadaqaaiabigdaXiabiIda4aGaayjkaiaawMcaaaWcbaGae8NAaOMaeyypa0JaeGymaedabaGae8NBa42aaSbaaWqaaiab=nhaZjab=vgaLjab=fhaXbqabaaaniabggHiLdaaaa@7289@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>Approximations of the empirical distributions are calculated using iterative non-linear curve fitting implemented in the XMGRACE tool <abbrgrp><abbr bid="B82">82</abbr></abbrgrp>.</p>
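            <p>The least-squares criterion of equation 18 can be illustrated with a minimal sketch. It assumes that f is a third-order polynomial in the score threshold (as stated in the legend of Figure 8) and that the transformed empirical values are already available as a plain list. Because such an f is linear in its coefficients, the sketch solves the normal equations directly rather than iterating as XMGRACE does. The function names fit_cubic and residual_sum_of_squares are hypothetical and are not part of pkaPS.</p>
            <p><![CDATA[
```python
def fit_cubic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x^2 + c3*x^3.

    xs: score thresholds; ys: transformed empirical values at those thresholds.
    Returns the coefficients [c0, c1, c2, c3] (assumed polynomial form).
    """
    n = 4
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i).
    A = [[float(sum(x ** (i + j) for x in xs)) for j in range(n)] for i in range(n)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]

    # Back substitution.
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(A[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (b[row] - tail) / A[row][row]
    return coeffs


def residual_sum_of_squares(xs, ys, coeffs):
    """Sum over all thresholds of (y_k - f(x_k))^2, the quantity being minimized."""
    total = 0.0
    for x, y in zip(xs, ys):
        fitted = sum(c * x ** i for i, c in enumerate(coeffs))
        total += (y - fitted) ** 2
    return total
```
]]></p>
            <p>At least four distinct thresholds are needed for the system to be solvable. A small residual sum on its own does not guarantee a good approximation in the sparsely populated tails of the distribution.</p>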
         </sec>
         <sec>
            <st>
               <p>Predictor implementation</p>
            </st>
            <p>The predictor for protein kinase A (PKA) dependent phosphorylation, pkaPS, integrates the motif-related knowledge presented in the preceding sections. The profile term S<sub>profile </sub>is calculated using positions -6 to +6. The implemented terms <it>T</it><sub><it>j </it></sub>reflect the structure of the substrate motif as deduced from the available sequence, structural and kinetic data. The main determinants for substrate specificity are the residues that interact with the enzyme in its binding pocket, and the adjacent positions at the mouth of the cavity. Various terms analyze these amino acids and combinations thereof for deviations from the typical physico-chemical motif fingerprint. Another group of terms evaluates the quality of the linkers that flank this region. These must have a minimal length to ensure that the phosphorylation site and its adjacent positions are sufficiently separated from the core of the respective substrate protein. The last set of terms is calculated over a region that extends further than the minimal linker length. The purpose of these functions is to exclude hydrophobic domains that might fold to protein cores, and thereby become inaccessible for substrate recognition. A summary of these terms, including the utilized physico-chemical properties, the implicated positions and references to the underlying rationales is presented in Table <tblr tid="T6">6</tblr>.</p>
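            <p>As an illustration of the sequence window over which the profile term is evaluated, the following sketch lists every serine or threonine in a query together with positions -6 to +6 around it. The padding character, the 1-based numbering and the function name candidate_windows are assumptions made for this example. How pkaPS itself treats sites close to a terminus is not specified here.</p>
            <p><![CDATA[
```python
def candidate_windows(sequence, flank=6, pad="X"):
    """Yield (position, residue, window) for every serine or threonine.

    position is 1-based. window spans positions -flank..+flank around the
    site and is filled with `pad` where it runs past either terminus
    (the padding convention is an assumption of this sketch).
    """
    padded = pad * flank + sequence + pad * flank
    for i, residue in enumerate(sequence):
        if residue in "ST":
            # Residue i sits at index i + flank in the padded string,
            # so the slice below is centred on it.
            yield i + 1, residue, padded[i:i + 2 * flank + 1]


if __name__ == "__main__":
    # Toy sequence, not taken from the learning set.
    for position, residue, window in candidate_windows("MRRASLG"):
        print(position, residue, window)
```
]]></p>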
         </sec>
         <sec>
            <st>
               <p>False-positive prediction rate within sets of proven negative examples</p>
            </st>
            <p>The 1026 unphosphorylated serines/threonines that were collected in the course of learning set construction served as a basis to evaluate the false-positive prediction rate of the pkaPS tool. A program run over these sequences revealed that 6.5% of the included entries produced scores <it>S </it>&#8805; 0 and can thus be classified as false positives. To assess the false-positive rate for any produced score <it>S </it>on the fly, an analytical score distribution was generated using the methodology presented above. Because a real set of non-phosphorylated sequences was available, the analytical distribution could be fitted directly to the empirical score distribution of the set of negative examples (Figure <figr fid="F8">8</figr>).</p>
            <fig id="F8">
               <title>
                  <p>Figure 8</p>
               </title>
               <caption>
                  <p>Approximation of the empirical score distribution of non-phosphorylated sites</p>
               </caption>
               <text>
                  <p><b>Approximation of the empirical score distribution of non-phosphorylated sites</b>. The empirical score distribution was approximated using equations 16 and 17. With a correlation coefficient of 0.9988, the applied polynomial fit of 3<sup>rd </sup>order provides a sufficiently accurate approximation of the expected false-positive rate. The parameters with respect to equation 17 are: <it>u </it>= -1.76847, &#955;<sub>1 </sub>= -0.766775, &#955;<sub>2 </sub>= 0.166677 and &#955;<sub>3 </sub>= -0.0298602. The polynomial fit was calculated using the XMGRACE tool [81].</p>
               </text>
               <graphic file="1745-6150-2-1-8"/>
            </fig>
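            <p>The empirical false-positive rate quoted above is simply the fraction of proven negative sites that reach a given score. A minimal sketch follows. The scores in the usage example are invented for illustration and are not pkaPS output, and the function name false_positive_rate is hypothetical.</p>
            <p><![CDATA[
```python
def false_positive_rate(negative_scores, threshold=0.0):
    """Fraction of known non-phosphorylated sites scoring at or above threshold."""
    if not negative_scores:
        raise ValueError("at least one score is required")
    hits = sum(1 for score in negative_scores if score >= threshold)
    return hits / len(negative_scores)


if __name__ == "__main__":
    # Invented scores for ten negative sites.
    scores = [-3.2, -1.0, 0.0, 0.4, -7.5, -0.1, -2.2, 1.3, -4.0, -0.6]
    print(false_positive_rate(scores))        # rate at the default cut-off S >= 0
    print(false_positive_rate(scores, 1.0))   # rate at a stricter cut-off
```
]]></p>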
         </sec>
      </sec>
      <sec>
         <st>
            <p>Reviewers' comments</p>
         </st>
         <sec>
            <st>
               <p>Reviewer's report I</p>
            </st>
            <p>Erik van Nimwegen, Biozentrum, University of Basel, Switzerland.</p>
            <p>This is a very thorough description of an algorithm for identifying phosphorylation sites of Protein Kinase A (PKA). It is clear that the authors put a lot of effort into deciding which physical features to use and how to use them. I am generally quite convinced that the pkaPS predictor provides the current state-of-the-art for PKA phosphorylation site prediction. Therefore this is clearly a very worthwhile paper. I have two main points of criticism:</p>
            <p>&#8226; The paper is too long. I appreciate all the information that the authors provide but I think the paper could be made much more readable by moving a lot of the material to supplementary materials and leaving a much more condensed and structured description of the key points. Right now there is just too much material to wade through for the reader to get a good overview of what is being done.</p>
            <p>Author response: <it>Our previous predictor developments have always been described in a pair of papers &#8211; one for the analysis of the property pattern near the modification site, another for the description and validation of the predictor. Here, two different but related scientific tasks have been combined into one text. Further, we wish to supply all the information that an interested reader needs to recreate the whole work and the implementation of the predictor. We feel that none of the information provided is dispensable. Nevertheless, we understand the concerns of the reviewer and decided to add an introductory overview section to the Results that summarizes the purpose of the respective sections and the major results described therein</it>.</p>
            <p>&#8226; I have concerns about over-fitting. There are a lot of parameters that go into the method that seem to have been set by hand (actually it is not entirely clear from the text how the parameters were set. This could be better explained). Examples are the collections of <it>&#945; </it>weights and the thresholds <it>t</it><sub><it>j </it></sub>. Given this moderately large set of parameters that have been tuned by the authors one wonders about over-fitting. In the description of the neighbor-jackknife test there is mention of "the parameterization procedure (neighbor-jackknife test, Materials and Methods)" but I did not see any description of this parameterization procedure. To address over-fitting, I propose that the authors do something like randomly dividing both the data set of positive examples as well as the set of negative examples in half. The parameters of the model should then be tuned independently on these two half-sets and false positive/negative rates can then be estimated by applying the two predictors to the half-sets not used in the training.</p>
            <p>Author response: <it>The revised version of the Methods section clarifies that the values </it><b><it>t</it></b><sub><b><it>j </it></b></sub><it>have not been used as adjustable parameters but as a concept to formally unify Gaussian-type physical property terms and fixed penalties. The values </it><b><it>t</it></b><sub><b><it>j </it></b></sub><it>have been described in the legend of Table </it><tblr tid="T6">6</tblr><it>. The only adjustable parameters in the physical property term part of the score are the 14 &#945;</it><sub><b><it>ppt,j</it></b></sub><it>. These are listed in </it>Table <tblr tid="T6">6</tblr><it> and the procedure for their determination is specified in the legend of Table </it><tblr tid="T6">6</tblr><it>. The exact values of the &#945;</it><sub><b><it>ppt,j</it></b></sub><it> are not critical since the physical property terms never generate a positive score (their purpose is to penalize non-permissive queries) and it is only important that the physical property terms do generate values close to zero for most of the learning set sequences. In the initial versions of the score function, each physical property term was even checked individually against the learning set to find maximal values for the &#945;</it><sub><b><it>ppt,j</it></b></sub><it>. Simple linear kernel support vector machines were used to obtain optimized guesses and, thus, to reduce the size of the twilight zone, a zone of scores indicating unclear hits. The question of parameter overfitting has already been answered in the neighbor jack-knife test when whole homologous groups of sequences have been taken out of the learning set. This is a more rigorous approach compared with the random selection of sequences since the score function can be biased already due to the occurrence of a single homologue in the set</it>.</p>
            <p>&#8226; page 3: "a prototypic model for the kinase group". In what sense is PKA prototypic?</p>
            <p>Author response: <it>The reformulation emphasizes that PKA is one of the best-studied kinases and, therefore, well suited for substrate site predictor development</it>.</p>
            <p>&#8226; page 30: "The fact that phosphorylation frequently occurs .... is shown in table <tblr tid="T4">4</tblr>." I don't understand the reasoning here. In table <tblr tid="T5">5</tblr> the number of phosphorylation sites per protein just seems to fall exponentially suggesting that the distribution might simply be a Poisson distribution, i.e. no particular bias toward having multiple sites per protein.</p>
            <p>Author response: <it>The respective paragraph has been expanded to clarify that we just wish to describe the phosphorylation site distribution of the learning set. We do not intend to postulate a specific bias except for the observation that, if multiple sites do occur in one protein, they tend to cluster together (see Results)</it>.</p>
            <p>&#8226; Page 33: Quantity <it>R</it>(<it>l</it>). Is this quantity ever used in the predictor? If not, what is the use of introducing it here?</p>
            <p>Author response: <it>The equations described in the Methods section "Sequence analysis part 2. Derivation of physical property characteristics" are used to filter the physical property pattern (see first three sections of the Results) prior to predictor development. We added text to this section to clarify this issue</it>.</p>
            <p>&#8226; page 35: "using the PSIC algorithm..." I am confused because on page 31 it was mentioned that the "sum of mismatches" method of Vingron and Argos is used.</p>
            <p>Author response: <it>Redundancy removal due to the occurrence of homologous sequences in the learning set is carried out differently for the physical property terms (with a modification of the Vingron-Argos procedure </it><abbrgrp><abbr bid="B74">74</abbr><abbr bid="B75">75</abbr></abbrgrp><it>) and for the profile term (with the PSIC method </it><abbrgrp><abbr bid="B77">77</abbr></abbrgrp><it>). The PSIC method is more sensitive but requires independent consideration of alignment positions. In physical property terms, we regularly consider multiple positions and the PSIC concept is formally not applicable in this context</it>.</p>
            <p>&#8226; page 39: "<it>S</it><sub><it>th </it></sub>is a polynomial..." The expression in (16) is not a polynomial.</p>
            <p>Author response: <it>True</it>, <b><it>f</it></b>(<b><it>S</it></b><sub><b><it>th</it></b></sub>)<it> is the polynomial function considered here. We reformulated the respective part</it>.</p>
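            <p>The difference between the random half-split proposed in the over-fitting comment above and the neighbor jack-knife test named in the response can be made concrete with a short sketch in which each group of homologous sequences is held out as a whole. The group labels are assumed to be given. How pkaPS assigns sequences to homologous groups is not shown here, and the function name leave_group_out_splits is hypothetical.</p>
            <p><![CDATA[
```python
from collections import defaultdict


def leave_group_out_splits(items, group_of):
    """Yield (group, training, held_out), holding out one whole group at a time.

    items: learning-set entries; group_of: function mapping an entry to the
    label of its homologous group (the labels are assumed to be known).
    """
    groups = defaultdict(list)
    for item in items:
        groups[group_of(item)].append(item)
    for name in sorted(groups):
        held_out = groups[name]
        # No member of the held-out group remains in the training part,
        # so a single close homologue cannot leak into parameterization.
        training = [item for item in items if group_of(item) != name]
        yield name, training, held_out


if __name__ == "__main__":
    # Invented entries of the form (site identifier, group label).
    entries = [("a1", "A"), ("a2", "A"), ("b1", "B"), ("c1", "C")]
    for name, training, held_out in leave_group_out_splits(entries, lambda e: e[1]):
        print(name, len(training), len(held_out))
```
]]></p>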
         </sec>
         <sec>
            <st>
               <p>Reviewer's report II</p>
            </st>
            <p>Sandor Pongor, International Centre for Genetic Engineering and Biotechnology, Trieste, Italy.</p>
            <p>The manuscript of Neuberger et al. "pkaPS: Prediction of Protein Kinase A Phosphorylation Sites with the Simplified Kinase-Substrate Binding Model" describes a heuristic method for describing PKA phosphorylation sites based on the distribution of various physicochemical parameters in the region flanking the phosphorylated residue as well as information on the foldedness of the polypeptide region. They present a scoring function that can confidently discriminate PKA phosphorylation sites from S/T residues in other environments. The predictor is made publicly available on a website. The description of the work is detailed and reproducible, and is in line with the group's previous work on similar subjects. The improvement over the other existing methods is convincing, and the idea of combining a physically reasonable model with statistical learning is an attractive one. I suggest the manuscript be published without modifications.</p>
         </sec>
         <sec>
            <st>
               <p>Reviewer's report III</p>
            </st>
            <p>Igor Zhulin, University of Tennessee, Oak Ridge National Laboratory, USA.</p>
            <p>In this paper, authors present the development of a prediction tool termed "pkaPS" for the purpose of identifying substrate proteins for the serine/threonine kinase PKA. Through a very thorough sequence/structure analysis, authors built a PKA-specific binding motif model, which can discriminate between PKA phosphorylation sites and other potential serine/threonine sites.</p>
            <p>In my opinion, both the strength and the weakness of this paper are in its very detailed format. The manuscript is quite long and jam-packed with information even though the authors moved most technical details into the methods section. I am sure that bioinformaticians will find this paper very interesting, whereas most biologists are unlikely to reach the third page of the results section. This is unfortunate, because some of the derived predictions would be quite interesting for them (see below). I recommend adding information on biologically relevant predictions to the abstract at the expense of some technical details. This may capture attention of those to whom this information is addressed.</p>
            <p>While skipping some details, I managed to follow authors' logic, which eventually resulted in building the analytical model for the kinase binding motif. I have to admit that this is a very difficult and noble task. The tendency to produce large numbers of false positives is a signature of most "sequence-only" motif predictors, and any attempt to overcome this problem inevitably leads to the need to incorporate chemistry and structure into the model. The authors did just that.</p>
            <p>Ultimately, the success of the new predictive method and associated tool will be measured by the number of correctly identified targets (although shown measures of sensitivity and specificity are important). Authors indicate that when applied to the human proteome, the predictor ranked most highly protein families that are known PKA targets, such as histone H2A. I found it very intriguing that the top scorers also include ribosomal proteins L21e, L22, L6 that are not known to undergo phosphorylation or interact with protein kinases. However, there is a new body of evidence that some ribosomal proteins, for example S6, can be phosphorylated by specific protein kinases (Ruvinsky &amp; Meyuhas, 2006 Trends Biochem Sci 31:342-8). Thus, predictions look very exciting and indeed produce testable hypotheses that might lead to novel discoveries in eukaryotic signal transduction.</p>
            <p>Author response: <it>Similar to the first reviewer, this referee expresses his concern with respect to readability of the article. We think that the new introductory overview section of the Results removes these concerns. We are grateful for the hint to the Ruvinsky &amp; Meyuhas article that supports some of the predictions in this work. We complement the summary with some of our biological results</it>.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Availability and requirements</p>
         </st>
         <p>The prediction tool is available as a WWW server at <url>http://mendel.imp.univie.ac.at/sat/pkaPS/</url> and is intended for fair use. Please contact the authors before analyzing large sets (>500 sequences). The server can be accessed with any web browser.</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The authors declare that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>GN and FE designed the study and evaluated the results. GN and GS carried out the programming and the sequence analytic work. All authors participated in drafting the manuscript and approved the final version.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>The authors are grateful for generous support from Boehringer Ingelheim. This project has been partly funded by the Austrian Gen-AU bioinformatics integration network (BIN phase II) sponsored by BM-BWK. The computational facilities have been supported by SUN Microsystems, Inc. within an academic Center of Excellence.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Functional analysis of protein kinase networks in living cells: beyond "knock-outs" and "knock-downs"</p>
            </title>
            <aug>
               <au>
                  <snm>Madhani</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Methods</source>
            <pubdate>2006</pubdate>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">16884918</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Crystal-Structure of the Catalytic Subunit of Cyclic Adenosine-Monophosphate Dependent Protein-Kinase</p>
            </title>
            <aug>
               <au>
                  <snm>Knighton</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Zheng</snm>
                  <fnm>JH</fnm>
               </au>
               <au>
                  <snm>Ten Eyck</snm>
                  <fnm>LF</fnm>
               </au>
               <au>
                  <snm>Ashford</snm>
                  <fnm>VA</fnm>
               </au>
               <au>
                  <snm>Xuong</snm>
                  <fnm>NH</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Sowadski</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1991</pubdate>
            <volume>253</volume>
            <fpage>407</fpage>
            <lpage>414</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1862342</pubid>
                  <pubid idtype="pmpid" link="fulltext">1862342</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Structure of A Peptide Inhibitor Bound to the Catalytic Subunit of Cyclic Adenosine-Monophosphate Dependent Protein-Kinase</p>
            </title>
            <aug>
               <au>
                  <snm>Knighton</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Zheng</snm>
                  <fnm>JH</fnm>
               </au>
               <au>
                  <snm>Ten Eyck</snm>
                  <fnm>LF</fnm>
               </au>
               <au>
                  <snm>Xuong</snm>
                  <fnm>NH</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Sowadski</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1991</pubdate>
            <volume>253</volume>
            <fpage>414</fpage>
            <lpage>420</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.1862343</pubid>
                  <pubid idtype="pmpid" link="fulltext">1862343</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Mechanistic Studies of cAMP-Dependent Protein-Kinase Action</p>
            </title>
            <aug>
               <au>
                  <snm>Bramson</snm>
                  <fnm>HN</fnm>
               </au>
               <au>
                  <snm>Kaiser</snm>
                  <fnm>ET</fnm>
               </au>
               <au>
                  <snm>Mildvan</snm>
                  <fnm>AS</fnm>
               </au>
            </aug>
            <source>CRC Critical Reviews in Biochemistry</source>
            <pubdate>1984</pubdate>
            <volume>15</volume>
            <fpage>93</fpage>
            <lpage>124</lpage>
            <xrefbib>
               <pubid idtype="pmpid">6365450</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Role of Multiple Basic Residues in Determining Substrate-Specificity of Cyclic AMP-Dependent Protein-Kinase</p>
            </title>
            <aug>
               <au>
                  <snm>Kemp</snm>
                  <fnm>BE</fnm>
               </au>
               <au>
                  <snm>Graves</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Benjamini</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Krebs</snm>
                  <fnm>EG</fnm>
               </au>
            </aug>
            <source>Journal of Biological Chemistry</source>
            <pubdate>1977</pubdate>
            <volume>252</volume>
            <fpage>4888</fpage>
            <lpage>4894</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">194899</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>Multiple Pathway Signal-Transduction by the cAMP-Dependent Protein-Kinase</p>
            </title>
            <aug>
               <au>
                  <snm>Walsh</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Van Patten</snm>
                  <fnm>SM</fnm>
               </au>
            </aug>
            <source>FASEB Journal</source>
            <pubdate>1994</pubdate>
            <volume>8</volume>
            <fpage>1227</fpage>
            <lpage>1236</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8001734</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Structural basis for peptide binding in protein kinase A - Role of glutamic acid 203 and tyrosine 204 in the peptide-positioning loop</p>
            </title>
            <aug>
               <au>
                  <snm>Moore</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Adams</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>SS</fnm>
               </au>
            </aug>
            <source>Journal of Biological Chemistry</source>
            <pubdate>2003</pubdate>
            <volume>278</volume>
            <fpage>10613</fpage>
            <lpage>10618</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M210807200</pubid>
                  <pubid idtype="pmpid" link="fulltext">12499371</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Sequence and structure-based prediction of eukaryotic protein phosphorylation sites</p>
            </title>
            <aug>
               <au>
                  <snm>Blom</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Gammeltoft</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Brunak</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Journal of Molecular Biology</source>
            <pubdate>1999</pubdate>
            <volume>294</volume>
            <fpage>1351</fpage>
            <lpage>1362</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1999.3310</pubid>
                  <pubid idtype="pmpid" link="fulltext">10600390</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>The PROSITE database, its status in 2002</p>
            </title>
            <aug>
               <au>
                  <snm>Falquet</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Pagni</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bucher</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Hulo</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Sigrist</snm>
                  <fnm>CJA</fnm>
               </au>
               <au>
                  <snm>Hofmann</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Bairoch</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Research</source>
            <pubdate>2002</pubdate>
            <volume>30</volume>
            <fpage>235</fpage>
            <lpage>238</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99105</pubid>
                  <pubid idtype="pmpid" link="fulltext">11752303</pubid>
                  <pubid idtype="doi">10.1093/nar/30.1.235</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>PROSITE: a documented database using patterns and profiles as motif descriptors</p>
            </title>
            <aug>
               <au>
                  <snm>Sigrist</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Cerutti</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Hulo</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Gattiker</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Falquet</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Pagni</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Bairoch</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bucher</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Briefings in Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <fpage>265</fpage>
            <lpage>274</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bib/3.3.265</pubid>
                  <pubid idtype="pmpid" link="fulltext">12230035</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>Recent improvements to the PROSITE database</p>
            </title>
            <aug>
               <au>
                  <snm>Hulo</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Sigrist</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Le Saux</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Langendijk-Genevaux</snm>
                  <fnm>PS</fnm>
               </au>
               <au>
                  <snm>Bordoli</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Gattiker</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>De Castro</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Bucher</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Bairoch</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>D134</fpage>
            <lpage>D137</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308778</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681377</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh044</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence</p>
            </title>
            <aug>
               <au>
                  <snm>Blom</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Sicheritz-Ponten</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Gupta</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Gammeltoft</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Brunak</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Proteomics</source>
            <pubdate>2004</pubdate>
            <volume>4</volume>
            <fpage>1633</fpage>
            <lpage>1649</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/pmic.200300771</pubid>
                  <pubid idtype="pmpid" link="fulltext">15174133</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Scansite 2.0: Proteome-wide prediction of cell signaling interactions using short sequence motifs</p>
            </title>
            <aug>
               <au>
                  <snm>Obenauer</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Cantley</snm>
                  <fnm>LC</fnm>
               </au>
               <au>
                  <snm>Yaffe</snm>
                  <fnm>MB</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>3635</fpage>
            <lpage>3641</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">168990</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824383</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg584</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Prediction of phosphorylation sites using SVMs</p>
            </title>
            <aug>
               <au>
                  <snm>Kim</snm>
                  <fnm>JH</fnm>
               </au>
               <au>
                  <snm>Lee</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Oh</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Kimm</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Koh</snm>
                  <fnm>I</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <fpage>3179</fpage>
            <lpage>3184</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bth382</pubid>
                  <pubid idtype="pmpid" link="fulltext">15231530</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B15">
            <title>
               <p>GPS: a novel group-based phosphorylation predicting and scoring method</p>
            </title>
            <aug>
               <au>
                  <snm>Zhou</snm>
                  <fnm>FF</fnm>
               </au>
               <au>
                  <snm>Xue</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>GL</fnm>
               </au>
               <au>
                  <snm>Yao</snm>
                  <fnm>X</fnm>
               </au>
            </aug>
            <source>Biochem Biophys Res Commun</source>
            <pubdate>2004</pubdate>
            <volume>325</volume>
            <fpage>1443</fpage>
            <lpage>1448</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.bbrc.2004.11.001</pubid>
                  <pubid idtype="pmpid" link="fulltext">15555589</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>GPS: a comprehensive www server for phosphorylation sites prediction</p>
            </title>
            <aug>
               <au>
                  <snm>Xue</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Zhou</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Zhu</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Ahmed</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Yao</snm>
                  <fnm>X</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>W184</fpage>
            <lpage>W187</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1160154</pubid>
                  <pubid idtype="pmpid" link="fulltext">15980451</pubid>
                  <pubid idtype="doi">10.1093/nar/gki393</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Prediction of potential GPI-modification sites in proprotein sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1999</pubdate>
            <volume>292</volume>
            <fpage>741</fpage>
            <lpage>758</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.1999.3069</pubid>
                  <pubid idtype="pmpid" link="fulltext">10497036</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Glycosylphosphatidylinositol lipid anchoring of plant proteins. Sensitive prediction from sequence- and genome-wide studies for Arabidopsis and rice</p>
            </title>
            <aug>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Wildpaner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Schultz</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Borner</snm>
                  <fnm>GH</fnm>
               </au>
               <au>
                  <snm>Dupree</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Plant Physiol</source>
            <pubdate>2003</pubdate>
            <volume>133</volume>
            <fpage>1691</fpage>
            <lpage>1701</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">300724</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681532</pubid>
                  <pubid idtype="doi">10.1104/pp.103.023580</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>A sensitive predictor for potential GPI lipid modification sites in fungal protein sequences and its application to genome-wide studies for Aspergillus nidulans, Candida albicans, Neurospora crassa, Saccharomyces cerevisiae and Schizosaccharomyces pombe</p>
            </title>
            <aug>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Schneider</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Wildpaner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2004</pubdate>
            <volume>337</volume>
            <fpage>243</fpage>
            <lpage>253</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.jmb.2004.01.025</pubid>
                  <pubid idtype="pmpid" link="fulltext">15003443</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>N-terminal N-myristoylation of proteins: prediction of substrate proteins from amino acid sequence</p>
            </title>
            <aug>
               <au>
                  <snm>Maurer-Stroh</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2002</pubdate>
            <volume>317</volume>
            <fpage>541</fpage>
            <lpage>557</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.2002.5426</pubid>
                  <pubid idtype="pmpid" link="fulltext">11955008</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Refinement and prediction of protein prenylation motifs</p>
            </title>
            <aug>
               <au>
                  <snm>Maurer-Stroh</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2005</pubdate>
            <volume>6</volume>
            <fpage>R55</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1175975</pubid>
                  <pubid idtype="pmpid" link="fulltext">15960807</pubid>
                  <pubid idtype="doi">10.1186/gb-2005-6-6-r55</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Prediction of peroxisomal targeting signal 1 containing proteins from amino acid sequence</p>
            </title>
            <aug>
               <au>
                  <snm>Neuberger</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Maurer-Stroh</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Hartig</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2003</pubdate>
            <volume>328</volume>
            <fpage>581</fpage>
            <lpage>592</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0022-2836(03)00319-X</pubid>
                  <pubid idtype="pmpid" link="fulltext">12706718</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Sequence properties of GPI-anchored proteins near the omega-site: constraints for the polypeptide binding site of the putative transamidase</p>
            </title>
            <aug>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Protein Eng</source>
            <pubdate>1998</pubdate>
            <volume>11</volume>
            <fpage>1155</fpage>
            <lpage>1161</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/protein/11.12.1155</pubid>
                  <pubid idtype="pmpid" link="fulltext">9930665</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>Motif refinement of the peroxisomal targeting signal 1 and evaluation of taxon-specific differences</p>
            </title>
            <aug>
               <au>
                  <snm>Neuberger</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Maurer-Stroh</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Hartig</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2003</pubdate>
            <volume>328</volume>
            <fpage>567</fpage>
            <lpage>579</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0022-2836(03)00318-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">12706717</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Use of an oriented peptide library to determine the optimal substrates of protein kinases</p>
            </title>
            <aug>
               <au>
                  <snm>Songyang</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Blechner</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hoagland</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Hoekstra</snm>
                  <fnm>MF</fnm>
               </au>
               <au>
                  <snm>Piwnica-Worms</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Cantley</snm>
                  <fnm>LC</fnm>
               </au>
            </aug>
            <source>Curr Biol</source>
            <pubdate>1994</pubdate>
            <volume>4</volume>
            <fpage>973</fpage>
            <lpage>982</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0960-9822(00)00221-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">7874496</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>N-terminal N-myristoylation of proteins: refinement of the sequence motif and its taxon-specific differences</p>
            </title>
            <aug>
               <au>
                  <snm>Maurer-Stroh</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2002</pubdate>
            <volume>317</volume>
            <fpage>523</fpage>
            <lpage>540</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.2002.5425</pubid>
                  <pubid idtype="pmpid" link="fulltext">11955007</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>The importance of intrinsic disorder for protein phosphorylation</p>
            </title>
            <aug>
               <au>
                  <snm>Iakoucheva</snm>
                  <fnm>LM</fnm>
               </au>
               <au>
                  <snm>Radivojac</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Brown</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>O'Connor</snm>
                  <fnm>TR</fnm>
               </au>
               <au>
                  <snm>Sikes</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Obradovic</snm>
                  <fnm>Z</fnm>
               </au>
               <au>
                  <snm>Dunker</snm>
                  <fnm>AK</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>1037</fpage>
            <lpage>1049</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">373391</pubid>
                  <pubid idtype="pmpid" link="fulltext">14960716</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh253</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Analysis of amino acid indices and mutation matrices for sequence comparison and structure prediction of proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Tomii</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kanehisa</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Protein Eng</source>
            <pubdate>1996</pubdate>
            <volume>9</volume>
            <fpage>27</fpage>
            <lpage>36</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/protein/9.1.27</pubid>
                  <pubid idtype="pmpid" link="fulltext">9053899</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Three-dimensional structure of membrane and surface proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Eisenberg</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Annu Rev Biochem</source>
            <pubdate>1984</pubdate>
            <volume>53</volume>
            <fpage>595</fpage>
            <lpage>623</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.bi.53.070184.003115</pubid>
                  <pubid idtype="pmpid" link="fulltext">6383201</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Accuracy of protein flexibility predictions</p>
            </title>
            <aug>
               <au>
                  <snm>Vihinen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Torkkila</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Riikonen</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1994</pubdate>
            <volume>19</volume>
            <fpage>141</fpage>
            <lpage>149</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.340190207</pubid>
                  <pubid idtype="pmpid">8090708</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Optimal spatial requirements for the location of basic residues in peptide substrates for the cyclic AMP-dependent protein kinase</p>
            </title>
            <aug>
               <au>
                  <snm>Feramisco</snm>
                  <fnm>JR</fnm>
               </au>
               <au>
                  <snm>Glass</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Krebs</snm>
                  <fnm>EG</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1980</pubdate>
            <volume>255</volume>
            <fpage>4240</fpage>
            <lpage>4245</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">6246116</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Local interactions as a structure determinant for protein molecules: II</p>
            </title>
            <aug>
               <au>
                  <snm>Krigbaum</snm>
                  <fnm>WR</fnm>
               </au>
               <au>
                  <snm>Komoriya</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Biochim Biophys Acta</source>
            <pubdate>1979</pubdate>
            <volume>576</volume>
            <fpage>204</fpage>
            <lpage>248</lpage>
            <xrefbib>
               <pubid idtype="pmpid">760806</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Distinct character in hydrophobicity of amino acid compositions of mitochondrial proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Nakashima</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Nishikawa</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ooi</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Proteins</source>
            <pubdate>1990</pubdate>
            <volume>8</volume>
            <fpage>173</fpage>
            <lpage>178</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/prot.340080207</pubid>
                  <pubid idtype="pmpid">2235995</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <title>
               <p>Amino-acid preferences for secondary structure vary with protein class</p>
            </title>
            <aug>
               <au>
                  <snm>Geisow</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Roberts</snm>
                  <fnm>RDB</fnm>
               </au>
            </aug>
            <source>Int J Biol Macromol</source>
            <pubdate>1980</pubdate>
            <volume>2</volume>
            <fpage>387</fpage>
            <lpage>389</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1016/0141-8130(80)90023-9</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Local hydrophobicity stabilizes secondary structures in proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Kanehisa</snm>
                  <fnm>MI</fnm>
               </au>
               <au>
                  <snm>Tsong</snm>
                  <fnm>TY</fnm>
               </au>
            </aug>
            <source>Biopolymers</source>
            <pubdate>1980</pubdate>
            <volume>19</volume>
            <fpage>1617</fpage>
            <lpage>1628</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/bip.1980.360190906</pubid>
                  <pubid idtype="pmpid">7426680</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B36">
            <source>Handbook of Biochemistry and Molecular Biology</source>
            <publisher>CRC Press</publisher>
            <editor>Fasman GD</editor>
            <pubdate>1976</pubdate>
         </bibl>
         <bibl id="B37">
            <title>
               <p>Studies on the phosphorylation of myelin basic protein by protein kinase C and adenosine 3':5'-monophosphate-dependent protein kinase</p>
            </title>
            <aug>
               <au>
                  <snm>Kishimoto</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Nishiyama</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Nakanishi</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Uratsuji</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Nomura</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Takeyama</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Nishizuka</snm>
                  <fnm>Y</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1985</pubdate>
            <volume>260</volume>
            <fpage>12492</fpage>
            <lpage>12499</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">2413024</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B38">
            <title>
               <p>Rat liver fructose-1,6-bisphosphatase. Identification of serine 338 as a third major phosphorylation site for cyclic AMP-dependent protein kinase and activity changes associated with multisite phosphorylation in vitro</p>
            </title>
            <aug>
               <au>
                  <snm>Ekdahl</snm>
                  <fnm>KN</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1987</pubdate>
            <volume>262</volume>
            <fpage>16699</fpage>
            <lpage>16703</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">2824503</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B39">
            <title>
               <p>Protein kinase A phosphorylates cyclin D1 at three distinct sites within the cyclin box and at the C-terminus</p>
            </title>
            <aug>
               <au>
                  <snm>Sewing</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Muller</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Oncogene</source>
            <pubdate>1994</pubdate>
            <volume>9</volume>
            <fpage>2733</fpage>
            <lpage>2736</lpage>
            <xrefbib>
               <pubid idtype="pmpid">8058338</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B40">
            <title>
               <p>NCBI FTP-site</p>
            </title>
            <url>ftp://ftp.ncbi.nlm.nih.gov/genomes/H_sapiens/protein/protein.fa</url>
         </bibl>
         <bibl id="B41">
            <title>
               <p>Hidden localization motifs: naturally occurring peroxisomal targeting signals in non-peroxisomal proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Neuberger</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Kunze</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Berger</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Hartig</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Brocard</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>R97</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">545800</pubid>
                  <pubid idtype="pmpid" link="fulltext">15575971</pubid>
                  <pubid idtype="doi">10.1186/gb-2004-5-12-r97</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B42">
            <title>
               <p>Machine learning approaches for the prediction of signal peptides and other protein sorting signals</p>
            </title>
            <aug>
               <au>
                  <snm>Nielsen</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Brunak</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>von Heijne</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Protein Eng</source>
            <pubdate>1999</pubdate>
            <volume>12</volume>
            <fpage>3</fpage>
            <lpage>9</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/protein/12.1.3</pubid>
                  <pubid idtype="pmpid" link="fulltext">10065704</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B43">
            <title>
               <p>MYRbase: analysis of genome-wide glycine myristoylation enlarges the functional spectrum of eukaryotic myristoylated proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Maurer-Stroh</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gouda</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Novatchkova</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Schleiffer</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Schneider</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Sirota</snm>
                  <fnm>FL</fnm>
               </au>
               <au>
                  <snm>Wildpaner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hayashi</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Genome Biol</source>
            <pubdate>2004</pubdate>
            <volume>5</volume>
            <fpage>R21</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">395771</pubid>
                  <pubid idtype="pmpid" link="fulltext">15003124</pubid>
                  <pubid idtype="doi">10.1186/gb-2004-5-3-r21</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B44">
            <title>
               <p>Species specific membrane anchoring of nyctalopin, a small leucine-rich repeat protein</p>
            </title>
            <aug>
               <au>
                  <snm>O'Connor</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Dalley</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Missen</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Bulleid</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Bishop</snm>
                  <fnm>PN</fnm>
               </au>
               <au>
                  <snm>Trump</snm>
                  <fnm>D</fnm>
               </au>
            </aug>
            <source>Hum Mol Genet</source>
            <pubdate>2005</pubdate>
            <volume>14</volume>
            <fpage>1877</fpage>
            <lpage>1887</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/hmg/ddi194</pubid>
                  <pubid idtype="pmpid" link="fulltext">15905181</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B45">
            <title>
               <p>Graph Clustering by Flow Simulation</p>
            </title>
            <aug>
               <au>
                  <snm>van Dongen</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <publisher>University of Utrecht</publisher>
            <pubdate>2005</pubdate>
            <url>http://micans.org/mcl/</url>
         </bibl>
         <bibl id="B46">
            <title>
               <p>Application of a sensitive collection heuristic for very large protein families: evolutionary relationship between adipose triglyceride lipase (ATGL) and classic mammalian lipases</p>
            </title>
            <aug>
               <au>
                  <snm>Schneider</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Neuberger</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Wildpaner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Tian</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Berezovsky</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2006</pubdate>
            <volume>7</volume>
            <fpage>164</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1435942</pubid>
                  <pubid idtype="pmpid" link="fulltext">16551354</pubid>
                  <pubid idtype="doi">10.1186/1471-2105-7-164</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B47">
            <title>
               <p>Ribosomal protein S6 phosphorylation is a determinant of cell size and glucose homeostasis</p>
            </title>
            <aug>
               <au>
                  <snm>Ruvinsky</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Sharon</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Lerer</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Cohen</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Stolovich-Rain</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nir</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Dor</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Zisman</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Meyuhas</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Genes Dev</source>
            <pubdate>2005</pubdate>
            <volume>19</volume>
            <fpage>2199</fpage>
            <lpage>2211</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1221890</pubid>
                  <pubid idtype="pmpid" link="fulltext">16166381</pubid>
                  <pubid idtype="doi">10.1101/gad.351605</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B48">
            <title>
               <p>Ribosomal protein S6 phosphorylation: from protein synthesis to cell size</p>
            </title>
            <aug>
               <au>
                  <snm>Ruvinsky</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Meyuhas</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Trends Biochem Sci</source>
            <pubdate>2006</pubdate>
            <volume>31</volume>
            <fpage>342</fpage>
            <lpage>348</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.tibs.2006.04.003</pubid>
                  <pubid idtype="pmpid" link="fulltext">16679021</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B49">
            <title>
               <p>The pkaPS WWW-site</p>
            </title>
            <pubdate>2007</pubdate>
            <url>http://mendel.imp.univie.ac.at/sat/pkaPS</url>
         </bibl>
         <bibl id="B50">
            <title>
               <p>Prediction of sequence signals for lipid post-translational modifications: insights from case studies</p>
            </title>
            <aug>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Maurer-Stroh</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Neuberger</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Proteomics</source>
            <pubdate>2004</pubdate>
            <volume>4</volume>
            <fpage>1614</fpage>
            <lpage>1625</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/pmic.200300781</pubid>
                  <pubid idtype="pmpid" link="fulltext">15174131</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B51">
            <title>
               <p>"Simulated molecular evolution" or computer-generated artifacts?</p>
            </title>
            <aug>
               <au>
                  <snm>Darius</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Rojas</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Biophys J</source>
            <pubdate>1994</pubdate>
            <volume>67</volume>
            <fpage>2120</fpage>
            <lpage>2122</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1225587</pubid>
                  <pubid idtype="pmpid">7858149</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B52">
            <title>
               <p>The Universal Protein Resource (UniProt)</p>
            </title>
            <aug>
               <au>
                  <snm>Bairoch</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>CH</fnm>
               </au>
               <au>
                  <snm>Barker</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Boeckmann</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Ferro</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gasteiger</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Magrane</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Natale</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>O'Donovan</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Redaschi</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Yeh</snm>
                  <fnm>LS</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>D154</fpage>
            <lpage>D159</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540024</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608167</pubid>
                  <pubid idtype="doi">10.1093/nar/gki070</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B53">
            <title>
               <p>UniProt: the Universal Protein knowledgebase</p>
            </title>
            <aug>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Bairoch</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Wu</snm>
                  <fnm>CH</fnm>
               </au>
               <au>
                  <snm>Barker</snm>
                  <fnm>WC</fnm>
               </au>
               <au>
                  <snm>Boeckmann</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Ferro</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Gasteiger</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Magrane</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Martin</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Natale</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>O'Donovan</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Redaschi</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Yeh</snm>
                  <fnm>LS</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>D115</fpage>
            <lpage>D119</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308865</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681372</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh131</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B54">
            <title>
               <p>UniProt archive</p>
            </title>
            <aug>
               <au>
                  <snm>Leinonen</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Diez</snm>
                  <fnm>FG</fnm>
               </au>
               <au>
                  <snm>Binns</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Fleischmann</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Lopez</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <fpage>3236</fpage>
            <lpage>3237</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bth191</pubid>
                  <pubid idtype="pmpid" link="fulltext">15044231</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B55">
            <title>
               <p>Phospho.ELM: Database of S/T/Y phosphorylation sites</p>
            </title>
            <pubdate>2007</pubdate>
            <url>http://phospho.elm.eu.org/</url>
         </bibl>
         <bibl id="B56">
            <title>
               <p>UNIPROT Protein Sequence Database</p>
            </title>
            <pubdate>2007</pubdate>
            <url>http://www.expasy.uniprot.org/</url>
         </bibl>
         <bibl id="B57">
            <title>
               <p>Methionine or not methionine at the beginning of a protein</p>
            </title>
            <aug>
               <au>
                  <snm>Sherman</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Stewart</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Tsunasawa</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Bioessays</source>
            <pubdate>1985</pubdate>
            <volume>3</volume>
            <fpage>27</fpage>
            <lpage>31</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/bies.950030108</pubid>
                  <pubid idtype="pmpid">3024631</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B58">
            <title>
               <p>The specificities of yeast methionine aminopeptidase and acetylation of amino-terminal methionine in vivo. Processing of altered iso-1-cytochromes c created by oligonucleotide transformation</p>
            </title>
            <aug>
               <au>
                  <snm>Moerschell</snm>
                  <fnm>RP</fnm>
               </au>
               <au>
                  <snm>Hosokawa</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Tsunasawa</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sherman</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1990</pubdate>
            <volume>265</volume>
            <fpage>19638</fpage>
            <lpage>19643</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">2174047</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B59">
            <title>
               <p>Sites phosphorylated in bovine cardiac troponin T and I. Characterization by 31P-NMR spectroscopy and phosphorylation by protein kinases</p>
            </title>
            <aug>
               <au>
                  <snm>Swiderek</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Jaquet</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Meyer</snm>
                  <fnm>HE</fnm>
               </au>
               <au>
                  <snm>Schachtele</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Hofmann</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Heilmeyer</snm>
                  <fnm>LM</fnm>
                  <suf>Jr.</suf>
               </au>
            </aug>
            <source>Eur J Biochem</source>
            <pubdate>1990</pubdate>
            <volume>190</volume>
            <fpage>575</fpage>
            <lpage>582</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1111/j.1432-1033.1990.tb15612.x</pubid>
                  <pubid idtype="pmpid">2373082</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B60">
            <title>
               <p>A common motif of two adjacent phosphoserines in bovine, rabbit and human cardiac troponin I</p>
            </title>
            <aug>
               <au>
                  <snm>Mittmann</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Jaquet</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Heilmeyer</snm>
                  <fnm>LM</fnm>
                  <suf>Jr.</suf>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>1990</pubdate>
            <volume>273</volume>
            <fpage>41</fpage>
            <lpage>45</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0014-5793(90)81046-Q</pubid>
                  <pubid idtype="pmpid" link="fulltext">2226863</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B61">
            <title>
               <p>Substrate specificity of the cyclic AMP-dependent protein kinase</p>
            </title>
            <aug>
               <au>
                  <snm>Kemp</snm>
                  <fnm>BE</fnm>
               </au>
               <au>
                  <snm>Bylund</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Krebs</snm>
                  <fnm>EG</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1975</pubdate>
            <volume>72</volume>
            <fpage>3448</fpage>
            <lpage>3452</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">433011</pubid>
                  <pubid idtype="pmpid">1059131</pubid>
                  <pubid idtype="doi">10.1073/pnas.72.9.3448</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B62">
            <title>
               <p>Protein kinase A-dependent phosphorylation of GLUT2 in pancreatic beta cells</p>
            </title>
            <aug>
               <au>
                  <snm>Thorens</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Deriaz</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Bosco</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>DeVos</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Pipeleers</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Schuit</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Meda</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Porret</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1996</pubdate>
            <volume>271</volume>
            <fpage>8075</fpage>
            <lpage>8081</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.271.14.8075</pubid>
                  <pubid idtype="pmpid" link="fulltext">8626492</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B63">
            <title>
               <p>Phosphorylation of the cytoplasmic tail of the PTH/PTHrP receptor</p>
            </title>
            <aug>
               <au>
                  <snm>Blind</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Bambino</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>ZM</fnm>
               </au>
               <au>
                  <snm>Bliziotes</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Nissenson</snm>
                  <fnm>RA</fnm>
               </au>
            </aug>
            <source>J Bone Miner Res</source>
            <pubdate>1996</pubdate>
            <volume>11</volume>
            <fpage>578</fpage>
            <lpage>586</lpage>
         </bibl>
         <bibl id="B64">
            <title>
               <p>Effect of denaturation on the susceptibility of proteins to enzymic phosphorylation</p>
            </title>
            <aug>
               <au>
                  <snm>Bylund</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Krebs</snm>
                  <fnm>EG</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1975</pubdate>
            <volume>250</volume>
            <fpage>6355</fpage>
            <lpage>6361</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">169238</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B65">
            <title>
               <p>The occurrence of serine phosphate in glycogenin: a possible regulatory site</p>
            </title>
            <aug>
               <au>
                  <snm>Lomako</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Whelan</snm>
                  <fnm>WJ</fnm>
               </au>
            </aug>
            <source>Biofactors</source>
            <pubdate>1988</pubdate>
            <volume>1</volume>
            <fpage>261</fpage>
            <lpage>264</lpage>
            <xrefbib>
               <pubid idtype="pmpid">3151442</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B66">
            <title>
               <p>Cyclic AMP-dependent kinase regulates Raf-1 kinase mainly by phosphorylation of serine 259</p>
            </title>
            <aug>
               <au>
                  <snm>Dhillon</snm>
                  <fnm>AS</fnm>
               </au>
               <au>
                  <snm>Pollock</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Steen</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Shaw</snm>
                  <fnm>PE</fnm>
               </au>
               <au>
                  <snm>Mischak</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Kolch</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>2002</pubdate>
            <volume>22</volume>
            <fpage>3237</fpage>
            <lpage>3246</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">133783</pubid>
                  <pubid idtype="pmpid" link="fulltext">11971957</pubid>
                  <pubid idtype="doi">10.1128/MCB.22.10.3237-3246.2002</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B67">
            <title>
               <p>Primary structure of the cAMP-dependent phosphorylation site of the plasma membrane calcium pump</p>
            </title>
            <aug>
               <au>
                  <snm>James</snm>
                  <fnm>PH</fnm>
               </au>
               <au>
                  <snm>Pruschy</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Vorherr</snm>
                  <fnm>TE</fnm>
               </au>
               <au>
                  <snm>Penniston</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Carafoli</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Biochemistry</source>
            <pubdate>1989</pubdate>
            <volume>28</volume>
            <fpage>4253</fpage>
            <lpage>4258</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1021/bi00436a020</pubid>
                  <pubid idtype="pmpid">2548572</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B68">
            <title>
               <p>Identification of the cAMP-Dependent Protein-Kinase and Protein-Kinase-C Phosphorylation Sites Within the Major Intracellular Domains of the Beta-1-Subunit, Gamma-2S-Subunit, and Gamma-2L-Subunit of the Gamma-Aminobutyric-Acid Type-A Receptor</p>
            </title>
            <aug>
               <au>
                  <snm>Moss</snm>
                  <fnm>SJ</fnm>
               </au>
               <au>
                  <snm>Doherty</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Huganir</snm>
                  <fnm>RL</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1992</pubdate>
            <volume>267</volume>
            <fpage>14470</fpage>
            <lpage>14476</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">1321150</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B69">
            <title>
               <p>Regulation of melanocortin-4 receptor signaling: agonist-mediated desensitization and internalization</p>
            </title>
            <aug>
               <au>
                  <snm>Shinyama</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Masuzaki</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Fang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Flier</snm>
                  <fnm>JS</fnm>
               </au>
            </aug>
            <source>Endocrinology</source>
            <pubdate>2003</pubdate>
            <volume>144</volume>
            <fpage>1301</fpage>
            <lpage>1314</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1210/en.2002-220931</pubid>
                  <pubid idtype="pmpid" link="fulltext">12639913</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B70">
            <title>
               <p>The neuropeptide processing enzyme EC 3.4.24.15 is modulated by protein kinase A phosphorylation</p>
            </title>
            <aug>
               <au>
                  <snm>Tullai</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Cummins</snm>
                  <fnm>PM</fnm>
               </au>
               <au>
                  <snm>Pabon</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Roberts</snm>
                  <fnm>JL</fnm>
               </au>
               <au>
                  <snm>Lopingco</snm>
                  <fnm>MC</fnm>
               </au>
               <au>
                  <snm>Shrimpton</snm>
                  <fnm>CN</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>AI</fnm>
               </au>
               <au>
                  <snm>Martignetti</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Ferro</snm>
                  <fnm>ES</fnm>
               </au>
               <au>
                  <snm>Glucksman</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>2000</pubdate>
            <volume>275</volume>
            <fpage>36514</fpage>
            <lpage>36522</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1074/jbc.M001843200</pubid>
                  <pubid idtype="pmpid" link="fulltext">10969067</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B71">
            <title>
               <p>Identification of four phosphorylation sites in the N-terminal region of tyrosine hydroxylase</p>
            </title>
            <aug>
               <au>
                  <snm>Campbell</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Hardie</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Vulliet</snm>
                  <fnm>PR</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1986</pubdate>
            <volume>261</volume>
            <fpage>10489</fpage>
            <lpage>10492</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">2874140</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B72">
            <title>
               <p>Phosphorylation of human choriogonadotropin. Stoichiometry and sites of phosphate incorporation</p>
            </title>
            <aug>
               <au>
                  <snm>Keutmann</snm>
                  <fnm>HT</fnm>
               </au>
               <au>
                  <snm>Ratanabanangkoon</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Pierce</snm>
                  <fnm>MW</fnm>
               </au>
               <au>
                  <snm>Kitzmann</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Ryan</snm>
                  <fnm>RJ</fnm>
               </au>
            </aug>
            <source>J Biol Chem</source>
            <pubdate>1983</pubdate>
            <volume>258</volume>
            <fpage>14521</fpage>
            <lpage>14526</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">6196363</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B73">
            <title>
               <p>Position-based sequence weights</p>
            </title>
            <aug>
               <au>
                  <snm>Henikoff</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Henikoff</snm>
                  <fnm>JG</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1994</pubdate>
            <volume>243</volume>
            <fpage>574</fpage>
            <lpage>578</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-2836(94)90032-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">7966282</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B74">
            <title>
               <p>A fast and sensitive multiple sequence alignment algorithm</p>
            </title>
            <aug>
               <au>
                  <snm>Vingron</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Argos</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Comput Appl Biosci</source>
            <pubdate>1989</pubdate>
            <volume>5</volume>
            <fpage>115</fpage>
            <lpage>121</lpage>
            <xrefbib>
               <pubid idtype="pmpid">2720461</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B75">
            <title>
               <p>Weighting in sequence space: a comparison of methods in terms of generalized sequences</p>
            </title>
            <aug>
               <au>
                  <snm>Vingron</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sibbald</snm>
                  <fnm>PR</fnm>
               </au>
            </aug>
            <source>Proc Natl Acad Sci U S A</source>
            <pubdate>1993</pubdate>
            <volume>90</volume>
            <fpage>8777</fpage>
            <lpage>8781</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">47443</pubid>
                  <pubid idtype="pmpid" link="fulltext">8415606</pubid>
                  <pubid idtype="doi">10.1073/pnas.90.19.8777</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B76">
            <aug>
               <au>
                  <snm>Kendall</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Stuart</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>The Advanced Theory of Statistics</source>
            <publisher>Griffin</publisher>
            <pubdate>1977</pubdate>
         </bibl>
         <bibl id="B77">
            <title>
               <p>PSIC: profile extraction from sequence alignments with position-specific counts of independent observations</p>
            </title>
            <aug>
               <au>
                  <snm>Sunyaev</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Rodchenkov</snm>
                  <fnm>IV</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Tumanyan</snm>
                  <fnm>VG</fnm>
               </au>
               <au>
                  <snm>Kuznetsov</snm>
                  <fnm>EN</fnm>
               </au>
            </aug>
            <source>Protein Eng</source>
            <pubdate>1999</pubdate>
            <volume>12</volume>
            <fpage>387</fpage>
            <lpage>394</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/protein/12.5.387</pubid>
                  <pubid idtype="pmpid" link="fulltext">10360979</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B78">
            <aug>
               <au>
                  <snm>Hastie</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Tibshirani</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Friedman</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>The Elements of Statistical Learning: Data Mining, Inference and Prediction</source>
            <publisher>Springer Verlag</publisher>
            <pubdate>2001</pubdate>
         </bibl>
         <bibl id="B79">
            <title>
               <p>Post-translational GPI lipid anchor modification of proteins in kingdoms of life: analysis of protein sequence data from complete genomes</p>
            </title>
            <aug>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Protein Eng</source>
            <pubdate>2001</pubdate>
            <volume>14</volume>
            <fpage>17</fpage>
            <lpage>25</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/protein/14.1.17</pubid>
                  <pubid idtype="pmpid" link="fulltext">11287675</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B80">
            <title>
               <p>Prediction of post-translational modifications from amino acid sequence: problems, pitfalls, methodological hints</p>
            </title>
            <aug>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Eisenhaber</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Maurer-Stroh</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Bioinformatics and Genomes: Current Perspectives</source>
            <publisher>Wymondham, Horizon Scientific Press</publisher>
            <editor>Andrade MA</editor>
            <edition>1</edition>
            <pubdate>2003</pubdate>
            <volume>5</volume>
            <fpage>81</fpage>
            <lpage>105</lpage>
         </bibl>
         <bibl id="B81">
            <title>
               <p>Issues in searching molecular sequence databases</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Boguski</snm>
                  <fnm>MS</fnm>
               </au>
               <au>
                  <snm>Gish</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Wootton</snm>
                  <fnm>JC</fnm>
               </au>
            </aug>
            <source>Nat Genet</source>
            <pubdate>1994</pubdate>
            <volume>6</volume>
            <fpage>119</fpage>
            <lpage>129</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1038/ng0294-119</pubid>
                  <pubid idtype="pmpid" link="fulltext">8162065</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B82">
            <title>
               <p>XMGRACE Software Package</p>
            </title>
            <pubdate>2007</pubdate>
            <url>http://plasma-gate.weizmann.ac.il/Grace/</url>
         </bibl>
         <bibl id="B83">
            <title>
               <p>cAMP-dependent protein kinase: crystallographic insights into substrate recognition and phosphotransfer</p>
            </title>
            <aug>
               <au>
                  <cnm>Madhusudan</cnm>
               </au>
               <au>
                  <snm>Trafny</snm>
                  <fnm>EA</fnm>
               </au>
               <au>
                  <snm>Xuong</snm>
                  <fnm>NH</fnm>
               </au>
               <au>
                  <snm>Adams</snm>
                  <fnm>JA</fnm>
               </au>
               <au>
                  <snm>Ten Eyck</snm>
                  <fnm>LF</fnm>
               </au>
               <au>
                  <snm>Taylor</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Sowadski</snm>
                  <fnm>JM</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>1994</pubdate>
            <volume>3</volume>
            <fpage>176</fpage>
            <lpage>187</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8003955</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B84">
            <title>
               <p>VMD: visual molecular dynamics</p>
            </title>
            <aug>
               <au>
                  <snm>Humphrey</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Dalke</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Schulten</snm>
                  <fnm>K</fnm>
               </au>
            </aug>
            <source>J Mol Graph</source>
            <pubdate>1996</pubdate>
            <volume>14</volume>
            <fpage>33</fpage>
            <lpage>38</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0263-7855(96)00018-5</pubid>
                  <pubid idtype="pmpid">8744570</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B85">
            <title>
               <p>The characterization of amino acid sequences in proteins by statistical methods</p>
            </title>
            <aug>
               <au>
                  <snm>Zimmerman</snm>
                  <fnm>JM</fnm>
               </au>
               <au>
                  <snm>Eliezer</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Simha</snm>
                  <fnm>R</fnm>
               </au>
            </aug>
            <source>J Theor Biol</source>
            <pubdate>1968</pubdate>
            <volume>21</volume>
            <fpage>170</fpage>
            <lpage>201</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0022-5193(68)90069-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">5700434</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B86">
            <title>
               <p>T-Coffee: A novel method for fast and accurate multiple sequence alignment</p>
            </title>
            <aug>
               <au>
                  <snm>Notredame</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Heringa</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>2000</pubdate>
            <volume>302</volume>
            <fpage>205</fpage>
            <lpage>217</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1006/jmbi.2000.4042</pubid>
                  <pubid idtype="pmpid" link="fulltext">10964570</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B87">
            <title>
               <p>The amino acid sequences of the phosphorylated sites in troponin-I from rabbit skeletal muscle</p>
            </title>
            <aug>
               <au>
                  <snm>Huang</snm>
                  <fnm>TS</fnm>
               </au>
               <au>
                  <snm>Bylund</snm>
                  <fnm>DB</fnm>
               </au>
               <au>
                  <snm>Stull</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Krebs</snm>
                  <fnm>EG</fnm>
               </au>
            </aug>
            <source>FEBS Lett</source>
            <pubdate>1974</pubdate>
            <volume>42</volume>
            <fpage>249</fpage>
            <lpage>252</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/0014-5793(74)80738-6</pubid>
                  <pubid idtype="pmpid" link="fulltext">4369265</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B88">
            <title>
               <p>Identification of a major cyclic AMP-dependent protein kinase A phosphorylation site within the cytoplasmic tail of the low-density lipoprotein receptor-related protein: implication for receptor-mediated endocytosis</p>
            </title>
            <aug>
               <au>
                  <snm>Li</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>van Kerkhof</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Marzolo</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Strous</snm>
                  <fnm>GJ</fnm>
               </au>
               <au>
                  <snm>Bu</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Mol Cell Biol</source>
            <pubdate>2001</pubdate>
            <volume>21</volume>
            <fpage>1185</fpage>
            <lpage>1195</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">99572</pubid>
                  <pubid idtype="pmpid" link="fulltext">11158305</pubid>
                  <pubid idtype="doi">10.1128/MCB.21.4.1185-1195.2001</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B89">
            <title>
               <p>Phosphorylation and modulation of a kainate receptor (GluR6) by cAMP-dependent protein kinase</p>
            </title>
            <aug>
               <au>
                  <snm>Wang</snm>
                  <fnm>LY</fnm>
               </au>
               <au>
                  <snm>Taverna</snm>
                  <fnm>FA</fnm>
               </au>
               <au>
                  <snm>Huang</snm>
                  <fnm>XP</fnm>
               </au>
               <au>
                  <snm>MacDonald</snm>
                  <fnm>JF</fnm>
               </au>
               <au>
                  <snm>Hampson</snm>
                  <fnm>DR</fnm>
               </au>
            </aug>
            <source>Science</source>
            <pubdate>1993</pubdate>
            <volume>259</volume>
            <fpage>1173</fpage>
            <lpage>1175</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1126/science.8382377</pubid>
                  <pubid idtype="pmpid" link="fulltext">8382377</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B90">
            <title>
               <p>CDD: a curated Entrez database of conserved domain alignments</p>
            </title>
            <aug>
               <au>
                  <snm>Marchler-Bauer</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Anderson</snm>
                  <fnm>JB</fnm>
               </au>
               <au>
                  <snm>DeWeese-Scott</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Fedorova</snm>
                  <fnm>ND</fnm>
               </au>
               <au>
                  <snm>Geer</snm>
                  <fnm>LY</fnm>
               </au>
               <au>
                  <snm>He</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Hurwitz</snm>
                  <fnm>DI</fnm>
               </au>
               <au>
                  <snm>Jackson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Jacobs</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Lanczycki</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Liebert</snm>
                  <fnm>CA</fnm>
               </au>
               <au>
                  <snm>Liu</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Madej</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Marchler</snm>
                  <fnm>GH</fnm>
               </au>
               <au>
                  <snm>Mazumder</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Nikolskaya</snm>
                  <fnm>AN</fnm>
               </au>
               <au>
                  <snm>Panchenko</snm>
                  <fnm>AR</fnm>
               </au>
               <au>
                  <snm>Rao</snm>
                  <fnm>BS</fnm>
               </au>
               <au>
                  <snm>Shoemaker</snm>
                  <fnm>BA</fnm>
               </au>
               <au>
                  <snm>Simonyan</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Song</snm>
                  <fnm>JS</fnm>
               </au>
               <au>
                  <snm>Thiessen</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Vasudevan</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Wang</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Yamashita</snm>
                  <fnm>RA</fnm>
               </au>
               <au>
                  <snm>Yin</snm>
                  <fnm>JJ</fnm>
               </au>
               <au>
                  <snm>Bryant</snm>
                  <fnm>SH</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <fpage>383</fpage>
            <lpage>387</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">165534</pubid>
                  <pubid idtype="pmpid" link="fulltext">12520028</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg087</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B91">
            <title>
               <p>The Pfam protein families database</p>
            </title>
            <aug>
               <au>
                  <snm>Bateman</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Coin</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Durbin</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Finn</snm>
                  <fnm>RD</fnm>
               </au>
               <au>
                  <snm>Hollich</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Griffiths-Jones</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Khanna</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Marshall</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Moxon</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sonnhammer</snm>
                  <fnm>EL</fnm>
               </au>
               <au>
                  <snm>Studholme</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Yeats</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Eddy</snm>
                  <fnm>SR</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>D138</fpage>
            <lpage>D141</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308855</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681378</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh121</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B92">
            <title>
               <p>Volume changes on protein folding</p>
            </title>
            <aug>
               <au>
                  <snm>Harpaz</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Gerstein</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Chothia</snm>
                  <fnm>C</fnm>
               </au>
            </aug>
            <source>Structure</source>
            <pubdate>1994</pubdate>
            <volume>2</volume>
            <fpage>641</fpage>
            <lpage>649</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0969-2126(00)00065-4</pubid>
                  <pubid idtype="pmpid">7922041</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B93">
            <title>
               <p>Prediction of Chain Flexibility in Proteins - A Tool for the Selection of Peptide Antigens</p>
            </title>
            <aug>
               <au>
                  <snm>Karplus</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Schulz</snm>
                  <fnm>GE</fnm>
               </au>
            </aug>
            <source>Naturwissenschaften</source>
            <pubdate>1985</pubdate>
            <volume>72</volume>
            <fpage>212</fpage>
            <lpage>213</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1007/BF01195768</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B94">
            <title>
               <p>Hydrophobicity and structural classes in proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Cid</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Bunster</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Canales</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gazitua</snm>
                  <fnm>F</fnm>
               </au>
            </aug>
            <source>Protein Eng</source>
            <pubdate>1992</pubdate>
            <volume>5</volume>
            <fpage>373</fpage>
            <lpage>375</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/protein/5.5.373</pubid>
                  <pubid idtype="pmpid" link="fulltext">1518784</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B95">
            <title>
               <p>LIBSVM: a library for support vector machines</p>
            </title>
            <aug>
               <au>
                  <snm>Chang</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Lin</snm>
                  <fnm>CJ</fnm>
               </au>
            </aug>
            <pubdate>2006</pubdate>
            <url>http://www.csie.ntu.edu.tw/~cjlin/libsvm/</url>
         </bibl>
      </refgrp>
   </bm>
</art>

