Table 1

Annotation errors identified, by gene

pur genes: F, D, N, T, L[a], S[a], Q[a], M, K, E, C, B, H1[c], P, P2[d], H2, O


Protein name errors

Partial misannotation/over-attribution: 3, 2, 1, [b]

Inappropriately vague name: 23, 1, 1, 27, 1, 4

Not justified due to missing features: 14, 2, 1, 4

EC number errors

One or more missing: 6, 13, 23, 35, 1

One or more incorrect/unjustified: 14, 2, 2, 2, 2, 1, 1, 31, 2

Gene structure errors

Start codon mis-called: 3, 1

Pseudogene label unjustified: 1

Gene symbol errors

Incorrect gene symbol: 2, 1, 1, 1

Number of genes examined: F 72, D 60, N 31, T 23, L 58, S 51, Q 58, M 58, K 28, E 59, C 63[e], B 60, H1 13[c], P 46[e], P2 31, H2 2[f], O 25


a The three gene products share an EC number.

b Naming of PurE is problematic. Some PurEs are not clearly class I or class II, and some organisms lack a PurK, making a class II type name more appropriate even when PurE appears class I. We counted either a class I or class II-type name as correct in this analysis.

c Halobacteria fusions of PurN and PurH1 are counted under PurN.

d Separate counts were maintained for PurP-like proteins in cluster II. We preferred generic names and no EC number, given a lack of demonstrated function for proteins in this cluster.

e A split or frame-shifted gene was counted as one gene.

f Excludes full-length PurH, counted under PurH1.

Brown et al. Biology Direct 2011 6:63   doi:10.1186/1745-6150-6-63
