TAFA: a novel secreted family with conserved cysteine residues and restricted expression in the brain
TAFA genes encode small secreted proteins
Our search strategy starts from EST/cDNA sequences, which we assembled into sequence contigs with an assembly algorithm [9]. A putative novel gene was defined as a collection of connected contigs with no significant BLAST homology to any known gene. We then clustered these genes by protein sequence similarity using a TBLASTX-based algorithm [2]. Putative coding regions were defined by the TBLASTX alignments. The clusters of candidate proteins were then evaluated for the ...
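The clustering step can be illustrated with a minimal sketch. The text does not specify the clustering algorithm, so this assumes single-linkage grouping: any two putative genes joined by a sufficiently strong pairwise hit fall into the same cluster. The function name `cluster_by_hits`, the `(query, subject, evalue)` tuple layout, and the `max_evalue` cutoff are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def cluster_by_hits(hits, max_evalue=1e-10):
    """Group sequence IDs into single-linkage clusters.

    hits: iterable of (query_id, subject_id, evalue) tuples, for example
    parsed from tabular similarity-search output.
    max_evalue: hypothetical significance cutoff (an assumption, not a
    value taken from the paper).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, evalue in hits:
        # Register both IDs so that unlinked sequences appear as singletons.
        root_q, root_s = find(query), find(subject)
        if query != subject and evalue <= max_evalue and root_q != root_s:
            parent[root_s] = root_q

    clusters = defaultdict(set)
    for seq_id in parent:
        clusters[find(seq_id)].add(seq_id)
    # Largest clusters first, ties broken alphabetically.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


# Made-up identifiers and e-values, for illustration only.
example_hits = [
    ("g1", "g2", 1e-30),
    ("g2", "g3", 1e-20),
    ("g4", "g5", 1e-3),   # too weak to link g4 and g5
    ("g4", "g4", 0.0),    # self-hit, ignored
]
print(cluster_by_hits(example_hits))
# [['g1', 'g2', 'g3'], ['g4'], ['g5']]
```

Single-linkage grouping is the simplest choice and tends to merge families through chains of weak intermediates; a stricter cutoff or a coverage requirement would be a natural refinement.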
Discussion
We have applied a new search strategy to identify novel gene families encoding small secreted proteins. Genomic and EST databases (public and private) were searched for transcripts that (1) have no significant homology to known sequences, (2) cluster into multigene families, (3) encode predicted signal sequences but lack transmembrane domains, and (4) have orthologs in other species. Using this method, we identified a new gene family, which we named TAFA.
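The four premises amount to a conjunctive filter over candidate transcripts, sketched below. The `Candidate` record and its field names are hypothetical; in practice each field would be filled in by a separate upstream analysis (similarity search, clustering, signal-peptide and transmembrane prediction, cross-species comparison). The threshold of at least two members for a "multigene family" is likewise an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Hypothetical summary of one candidate transcript."""
    name: str
    has_known_homolog: bool        # premise 1: significant hit to a known sequence
    family_size: int               # premise 2: members in its similarity cluster
    has_signal_peptide: bool       # premise 3a: predicted signal sequence
    transmembrane_domains: int     # premise 3b: predicted TM segments
    ortholog_species: set = field(default_factory=set)  # premise 4


def passes_premises(c, min_family_size=2):
    """Return True only if the candidate satisfies all four premises."""
    return (
        not c.has_known_homolog
        and c.family_size >= min_family_size
        and c.has_signal_peptide
        and c.transmembrane_domains == 0
        and len(c.ortholog_species) >= 1
    )


# Made-up candidates, for illustration only.
candidates = [
    Candidate("cand_A", False, 5, True, 0, {"mouse"}),
    Candidate("cand_B", False, 3, True, 1, {"mouse"}),   # has a TM domain
    Candidate("cand_C", True, 4, True, 0, {"mouse"}),    # known homolog
    Candidate("cand_D", False, 1, True, 0, {"mouse"}),   # not multigene
]
selected = [c.name for c in candidates if passes_premises(c)]
print(selected)
# ['cand_A']
```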
Sequence assembly and gene discovery
DNA sequences were assembled from internal Nuvelo EST sequences and from sequences available through the NCBI. Others have used similar methods to estimate the number of independent genes in the human genome [14]. Chromatograms were obtained either from cDNA clones, by dideoxy sequencing and resolution on ABI 377/3700 sequencers, or by download from the public dbEST database. Phred was used for base-calling and to assign quality scores [15], [16]. The ...
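For readers unfamiliar with Phred quality scores, the standard definition relates a score Q to the estimated base-calling error probability P by Q = -10 log10(P). The short sketch below shows that conversion and the expected number of miscalled bases in a read; the function names are hypothetical, and nothing here reflects how the scores were actually used in the assembly.

```python
import math


def phred_to_error_prob(q):
    """Error probability implied by Phred quality score q."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred quality score implied by error probability p (0 < p <= 1)."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


def expected_errors(qualities):
    """Expected number of miscalled bases in a read, given per-base scores."""
    return sum(phred_to_error_prob(q) for q in qualities)
```

Under this definition, a score of 20 corresponds to a 1-in-100 chance that the base call is wrong, and a score of 30 to a 1-in-1000 chance.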
References (40)
- et al. Chemokines: a new classification system and their role in immunity. Immunity (2000)
- An improved sequence assembly program. Genomics (1996)
- et al. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. (1997)
- et al. Inflammation in neurodegenerative disease—a double-edged sword. Neuron (2002)
- Cytokine actions in the central nervous system. Cytokine Growth Factor Rev. (1998)
- et al. cAMP-dependent growth cone guidance by netrin-1. Neuron (1997)
- et al. Improved tools for biological sequences comparison. Proc. Natl. Acad. Sci. USA (1988)
- W. Pearson, FASTA programs at the Univ. of Virginia: http://fasta.bioch.virginia.edu,...
- et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. (1997)
- et al. Protein database searches for multiple alignments. Proc. Natl. Acad. Sci. USA (1990)