Open Access Research

Parameters of proteome evolution from histograms of amino-acid sequence identities of paralogous proteins

Jacob Bock Axelsen1,2, Koon-Kiu Yan2,3 and Sergei Maslov2,3

1 Center for Models of Life, Niels Bohr Institute, Blegdamsvej 17, DK-2100, Copenhagen Ø, Denmark

2 Department of Condensed Matter Physics and Materials Science, Brookhaven National Laboratory, Upton, New York 11973, USA

3 Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794, USA

Biology Direct 2007, 2:32 doi:10.1186/1745-6150-2-32

Published: 26 November 2007

Additional files

Additional file 1:

The overall shape of the PID histogram is independent of the alignment algorithm and the E-value cutoff. The PID histogram N_a(p) in the fly (D. melanogaster) genome when pairs of paralogous proteins were detected using the blastp algorithm [1] with E-value cutoffs of 10^-10 (filled circles) and 10^-30 (open diamonds). The inset shows the ratio of these two histograms, which is very close to 1 for p > 40%. Thus the overall shape of N_a(p) in most of Region II (Fig. 1) is nearly cutoff independent. N_a(p) is also insensitive to the particular algorithm used to align the pairs. Indeed, when paralogous pairs detected by blastp with the E-value cutoff of 10^-10 (filled circles) were realigned using the Smith-Waterman algorithm [28], the resulting distribution (blue stars) changed very little.
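The cutoff comparison described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format (a list of `(pid_percent, evalue)` tuples), the 5% bin width, the function names, and the choice to divide the stricter-cutoff histogram by the looser one are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def pid_histogram(pairs, evalue_cutoff, bin_width=5):
    """Count paralogous pairs per PID bin, keeping hits with E-value <= cutoff.

    `pairs` is an iterable of (pid_percent, evalue) tuples. This layout and
    the default bin width are assumptions made for this sketch.
    """
    counts = Counter()
    for pid, evalue in pairs:
        if evalue <= evalue_cutoff:
            # Key each bin by its lower edge; a PID of exactly 100 is
            # folded into the top bin instead of opening a new one.
            edge = min(int(pid // bin_width) * bin_width, 100 - bin_width)
            counts[edge] += 1
    return counts


def histogram_ratio(strict, loose):
    """Per-bin ratio strict/loose, skipping bins that are empty in `loose`."""
    return {
        edge: strict.get(edge, 0) / n
        for edge, n in sorted(loose.items())
        if n > 0
    }
```

With real data one would call `pid_histogram(pairs, 1e-10)` and `pid_histogram(pairs, 1e-30)` and inspect `histogram_ratio` for bins above 40% identity; a ratio near 1 there corresponds to the cutoff independence stated in the caption.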

Additional file 2:

The quadratic scaling of the total number of paralogous pairs with the number of genes in the genome. The total number of paralogous pairs ∑_p N_a(p) generated by the all-to-all alignment of all protein sequences encoded in the genome (the y-axis) scales as the square of the total number N_genes of protein-coding genes in the genome. Solid symbols are the six model organisms used in our study. The solid line has slope 2 on this log-log plot.
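A scaling exponent such as the one described in this caption can be checked by fitting a straight line to the data on log-log axes. The sketch below uses an ordinary least-squares fit in pure Python. It is an illustration only: the function name `loglog_slope` is hypothetical, and the caption does not say how, or whether, the authors fitted the line.

```python
import math


def loglog_slope(n_genes, n_pairs):
    """Least-squares slope and intercept of log10(n_pairs) vs log10(n_genes).

    A slope close to 2 would indicate quadratic scaling. Both inputs are
    equal-length sequences of positive numbers (one entry per genome).
    """
    xs = [math.log10(n) for n in n_genes]
    ys = [math.log10(p) for p in n_pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Standard closed-form simple linear regression.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

For example, feeding in synthetic counts that obey n_pairs = c * n_genes^2 exactly returns a slope of 2 up to floating-point error; real genome data would scatter around the fitted line.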

