Programs & Activities
NCBI has a multidisciplinary research group of computer scientists, molecular biologists, mathematicians, biochemists, research physicians, and structural biologists concentrating on basic and applied research in computational molecular biology. These investigators not only make important contributions to basic science but also serve as a wellspring of new methods for applied research activities. Together they study fundamental biomedical problems at the molecular level using mathematical and computational methods. These problems include gene organization, sequence analysis, and structure prediction. A sampling of current research projects includes the following: detection and analysis of gene organization, repeating sequence patterns, protein domains, and structural elements; creation of a gene map of the human genome; mathematical modeling of the kinetics of HIV infection; analysis of the effects of sequencing errors on database searching; development of new algorithms for database searching and multiple sequence alignment; construction of non-redundant sequence databases; mathematical models for estimating the statistical significance of sequence similarity; and vector models for text retrieval. Additionally, NCBI investigators maintain ongoing collaborations with several institutes within the NIH and with numerous academic and government research laboratories.
Databases and Software
NCBI assumed responsibility for the GenBank DNA sequence database in October 1992. NCBI staff with advanced training in molecular biology build the database from sequences submitted by individual laboratories and by data exchange with the international nucleotide sequence databases, the European Molecular Biology Laboratory (EMBL) and the DNA Data Bank of Japan (DDBJ). Arrangements with the U.S. Patent and Trademark Office enable the incorporation of patented sequence data.
In addition to GenBank, NCBI supports and distributes a variety of databases for the medical and scientific communities. These include the Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB) of 3D protein structures, the Unique Human Gene Sequence Collection (UniGene), a Gene Map of the Human Genome, the Taxonomy Browser, and the Cancer Genome Anatomy Project (CGAP), in collaboration with the National Cancer Institute.
Entrez is NCBI's search and retrieval system that provides users with integrated access to sequence, mapping, taxonomy, and structural data. Entrez also provides graphical views of sequences and chromosome maps. A powerful and unique feature of Entrez is the ability to retrieve related sequences, structures, and references. The journal literature is available through PubMed, a Web search interface that provides access to over 11 million journal citations in MEDLINE and contains links to full-text articles at participating publishers' Web sites.
BLAST, a program for sequence similarity searching developed at NCBI, is instrumental in identifying genes and genetic features. BLAST can execute sequence searches against the entire DNA database in less than 15 seconds. Additional software tools provided by NCBI include Open Reading Frame Finder (ORF Finder), Electronic PCR, and the sequence submission tools Sequin and BankIt. All of NCBI's databases and software tools are available on the Web or by FTP. NCBI also has email servers that provide an alternative way to access the databases for text searching or sequence similarity searching.
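To give a sense of what an open reading frame search involves, here is a minimal sketch in Python. It is not NCBI's ORF Finder and makes several simplifying assumptions: it scans only the forward strand, treats ATG as the only start codon, uses the standard stop codons (TAA, TAG, TGA), and reports at most one ORF per stop codon. The function name `find_orfs` and the `min_codons` parameter are hypothetical, chosen for illustration.

```python
from typing import List, Tuple

# Assumption: standard genetic code, forward strand only.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq: str, min_codons: int = 3) -> List[Tuple[int, int, int]]:
    """Return (frame, start, end) for each ORF found on the forward strand.

    start is inclusive and end is exclusive; the span includes the stop codon.
    min_codons counts every codon in the span, including start and stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the first ATG seen since the last stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # resume scanning for the next start codon
    return orfs


if __name__ == "__main__":
    dna = "CCATGAAATGATT"
    for frame, start, end in find_orfs(dna):
        print(frame, start, end, dna[start:end])
```

A fuller tool would also scan the reverse complement, allow alternative start codons and genetic codes, and handle ORFs that run off the end of the sequence.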
Outreach and Education
NCBI fosters scientific communication on the application of computing to molecular biology and genetics by sponsoring meetings, workshops, and lecture series. A Scientific Visitors Program has been established to foster collaborations with extramural scientists. Postdoctoral fellow positions are available as part of the NIH Intramural Research Program.