The Anabaena sensory rhodopsin transducer (ASRT) is a small protein that has been claimed to function as a signaling molecule downstream of the cyanobacterial sensory rhodopsin. However, orthologs of ASRT have been detected in several bacteria that lack rhodopsin, raising questions about the generality of this function. Using sequence profile searches, we show that ASRT defines a novel superfamily of beta-sandwich fold domains. Through contextual inference based on domain architectures and predicted operons, together with structural analysis, we present strong evidence that these domains bind small molecules, most probably sugars. We propose that intracellular versions such as ASRT probably act as sensors that regulate a diverse range of sugar metabolism operons, or even the light-sensory behavior of Anabaena, by binding sugars or related metabolites. We also show that one of the extracellular versions defines a predicted sugar-binding structure in a novel cell-surface lipoprotein found across actinobacteria, including several pathogens such as Tropheryma, Actinomyces and Thermobifida. The analysis of this superfamily also provides new data for investigating the evolution of carbohydrate-binding modes in beta-sandwich domains with very different topologies.
MATERIALS AND METHODS
Profile-based searches were conducted using the PSI-BLAST and HMMER (Eddy, 1998) programs.
PSI-BLAST (Altschul et al., 1997) searches were performed against the non-redundant (NR) database of protein sequences
and the non-redundant database of environmental sequences (National Center for Biotechnology Information [NCBI], NIH, Bethesda, MD, USA),
with either a single sequence or an alignment used as the query and the default profile inclusion expectation (E) value threshold set
to 0.01. Most searches were iterated until convergence. A statistical correction for compositional bias was used to reduce false positives
(Schäffer et al., 2001). To exhaustively recover all orthologs of the ASRAH domain, sequences recovered by each of these searches
were used as queries for further PSI-BLAST searches. Independent HMM-HMM profile alignment searches were also performed against a
non-redundant version of the PDB database using HHpred (Söding, 2005; Söding et al., 2005). Single-linkage clustering of the retrieved
proteins was then performed using the BLASTCLUST program (ftp://ftp.ncbi.nih.gov/blast/documents/blastclust.html). A comprehensive multiple
alignment was generated using the KALIGN program (Lassmann and Sonnhammer, 2005) and was further adjusted manually based on PSI-BLAST results,
secondary structure predictions and multiple alignments of individual families. Protein secondary structure was predicted using the JPRED
program (Cuff et al., 1998), which uses information extracted from a PSSM, an HMM, and the input seed alignment. The in-house TASS package
(Anantharaman V, Balaji S, Aravind L; unpublished) was used to automate all large-scale sequence analysis procedures, including
domain architecture and gene neighbourhood analysis. PyMOL was used to prepare and analyze the PDB structure of the ASRAH homolog from Thermotoga.
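The single-linkage clustering step mentioned above can be illustrated with a minimal Python sketch. It is not the BLASTCLUST implementation and does not reproduce its scoring or coverage criteria. The function name, the sequence identifiers, the pairwise scores and the threshold of 0.5 are all hypothetical and chosen only for illustration. Under single linkage, two sequences fall into the same cluster if they are connected by any chain of pairwise links that each pass the threshold, even when the two sequences themselves score poorly against each other.

```python
from collections import defaultdict


def single_linkage_clusters(ids, pair_scores, threshold):
    """Group sequence identifiers into single-linkage clusters.

    ids: list of sequence identifiers.
    pair_scores: dict mapping (id_a, id_b) to a similarity score
        (hypothetical input, e.g. derived from pairwise BLAST hits);
        every identifier used here must also appear in ids.
    threshold: minimum score for two sequences to be linked
        (an assumed, illustrative cutoff).
    """
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the clusters of every pair whose score passes the threshold.
    for (a, b), score in pair_scores.items():
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = defaultdict(list)
    for i in parent:
        clusters[find(i)].append(i)

    # Largest clusters first; ties broken alphabetically for stable output.
    return sorted((sorted(m) for m in clusters.values()),
                  key=lambda c: (-len(c), c))


# Hypothetical identifiers and scores, for illustration only.
example_ids = ["seqA", "seqB", "seqC", "seqD", "seqE"]
example_scores = {
    ("seqA", "seqB"): 0.9,
    ("seqB", "seqC"): 0.8,
    ("seqA", "seqC"): 0.2,  # weak direct link; seqC still joins via seqB
    ("seqD", "seqE"): 0.3,  # below threshold; both remain singletons
}
example_clusters = single_linkage_clusters(example_ids, example_scores, 0.5)
print(example_clusters)
```

In this toy input, seqA and seqC end up in the same cluster through seqB, which is the chaining behaviour that makes single linkage suitable for gathering divergent members of a family into one group.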
2. Comprehensive multiple alignment of the ASRAH domain