High-throughput genome projects have resulted in a
rapid accumulation of genome sequences for a large number of organisms and a
large number of genes of unknown function (hypothetical genes). To fully realize
the value of the data, scientists need to identify proteins encoded by these
genes and understand how these proteins function in making up a living cell.
With experimentally verified information on protein function lagging far behind,
computational methods are needed for reliable and large-scale functional
annotation of proteins. Functional annotation is the process of identifying for
a given gene its biological function, interaction with other elements,
involvement in metabolic pathways, and any other piece of information that
helps in understanding when and how a gene influences the overall system.
Moreover, many biological processes and disease mechanisms remain
unknown because the functions of the hypothetical genes in human are not
known. Once their functions are revealed, the hurdle of unknown
mechanisms in the human genome can be overcome. Hence, the present study aims to
use computational approaches to annotate the function of hypothetical genes on
human chromosome 2. The annotation was carried out at both the nucleotide and
protein levels. Of the 41 uncharacterized hypothetical genes on human
chromosome 2, the functions of 27 were successfully annotated. Further
experimental validation is essential to confirm the predicted functions.