Biology Direct

Open Access Research

Systematic analysis of mRNA 5' coding sequence incompleteness in Danio rerio: an automated EST-based approach

Flavia Frabetti, Raffaella Casadei*, Luca Lenzi, Silvia Canaider, Lorenza Vitale, Federica Facchin, Paolo Carinci, Maria Zannotti and Pierluigi Strippoli

Author Affiliations

Center for Research in Molecular Genetics "Fondazione CARISBO", Department of Histology, Embryology and Applied Biology, University of Bologna, via Belmeloro 8, 40126 Bologna (BO), Italy

Biology Direct 2007, 2:34 doi:10.1186/1745-6150-2-34

Published: 27 November 2007

Abstract

Background

All standard methods for cDNA cloning can fail to capture the complete 5' region of an mRNA. The aim of this work was to estimate how complete the 5' region of the mRNA open reading frame (ORF) is in sequences from the model organism Danio rerio (zebrafish).

Results

We implemented a novel automated approach (5'_ORF_Extender) that systematically compares available expressed sequence tags (ESTs) with all experimentally determined zebrafish mRNA sequences, identifies additional sequence stretches at the 5' region and scans for the presence of all conditions needed to define a new, extended putative ORF. Our software identified 285 (3.3%) mRNAs with putatively incomplete ORFs at the 5' region and, in three selected example cases (selt1a, unc119.2, nppa), the extended coding region at the 5' end was cloned by reverse transcription-polymerase chain reaction (RT-PCR).
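
The ORF-extension check described above can be illustrated with a minimal Python sketch. It is not the 5'_ORF_Extender code. It assumes that the EST has already been aligned to the mRNA and that the EST-derived sequence lying immediately 5' of the annotated start codon is available as a string. It also assumes one plausible reading of the "conditions needed": an upstream ATG in frame with the annotated start codon, with no in-frame stop codon between the two. The function name `extend_orf_5prime` and both parameter names are hypothetical.

```python
# Illustrative sketch only; names and the exact extension rule are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extend_orf_5prime(upstream, cds):
    """Return an extended ORF, or None if no extension is supported.

    upstream: EST-derived sequence ending right before the annotated start codon.
    cds: annotated coding sequence, beginning with its start codon.
    """
    upstream = upstream.upper()
    cds = cds.upper()
    farthest_atg = None
    # Step backwards from the annotated start, one codon at a time,
    # so every codon examined is in frame with the annotated ORF.
    pos = len(upstream) - 3
    while pos >= 0:
        codon = upstream[pos:pos + 3]
        if codon in STOP_CODONS:
            break  # an in-frame stop closes the reading frame
        if codon == "ATG":
            farthest_atg = pos  # keep the most 5' in-frame ATG seen so far
        pos -= 3
    if farthest_atg is None:
        return None
    return upstream[farthest_atg:] + cds


# An in-frame upstream ATG with no intervening stop extends the ORF.
print(extend_orf_5prime("CCATGGCTAAA", "ATGGGTTGA"))
# An in-frame stop between the upstream ATG and the annotated start blocks it.
print(extend_orf_5prime("ATGTAAGCT", "ATGGGTTGA"))
```

A real pipeline would also need to handle alignment quality, EST sequencing errors and ESTs that extend the frame without reaching a start codon. None of these are covered here.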

Conclusion

The implemented method, which could also be useful for the analysis of other genomes, allowed us to describe the relevance of the "5' end mRNA artifact" problem for genomic annotation and functional genomic experiment design in zebrafish.

Open peer review

This article was reviewed by Alexey V. Kochetov (nominated by Mikhail Gelfand), Shamil Sunyaev, and Gáspár Jékely. For the full reviews, please go to the Reviewers' Comments section.