Background: The function of some genes remains obscure because they lack similarity to
known regions of the genome. Such known ‘unknown’ genes, which constitute Open Reading
Frames (ORFs) that remain in the epigenome, are termed orphan genes, and the proteins they
encode that lack experimental evidence of translation are termed ‘Hypothetical Proteins’.
Objectives: We have enhanced HypoDB, our earlier database of human Hypothetical Proteins (HPs),
with additional annotation, application programming interfaces and descriptive features. The
database hosts 1000+ manually curated records of the known ‘unknown’ regions in the human genome.
The updated version of HypoDB, with BLAST and Match functionality, is freely accessible
Methods: The total collection of HPs was checked against the experimentally validated set
(Swiss-Prot), the non-experimentally validated set (TrEMBL) or the complete set (UniProtKB). The
database was designed with Java at the core backend and integrated with databases such as EMBL, PIR
and HPRD, as well as databases providing structural, interaction and association descriptors.
Results: HypoDB includes Application Programming Interfaces (APIs) for implicitly searching
resources and linking them to other databases such as NCBI Link-out. In addition, multiple search
capabilities along with advanced searches using integrated bio-tools, viz. Match and BLAST, were
Conclusion: HypoDB is perhaps the only open-source HP database with a range of tools for
common bioinformatics retrievals, and it serves as a ready reference for researchers interested
in finding candidate sequences for potential experimental work.