NCBI's interactions with Locus-Specific Databases (LSDBs)

Overview

Resources available through NCBI that represent information about human variation include the following. Curators of locus-specific databases are encouraged to register their curation results with ClinVar, which will result in registration in dbSNP and dbVar as well.

  • dbGaP, database of Genotypes and Phenotypes. dbGaP archives documents, measured variables, and genotypes from genome-wide association studies. It validates the genotypes received and computes associations to measured phenotypes. Access to some of the content of this database is controlled, but association data become freely available as embargoes are lifted.
  • dbSNP, database of short genetic variation. dbSNP is an archival repository for information about sequence variation. Despite its acronym, dbSNP is not limited to single nucleotide polymorphisms (SNPs); it also stores information about multiple types of small-scale variation, including insertions, deletions, minisatellites, microsatellites, and non-polymorphic (rare) variation.
  • dbVar, database of genomic structural variation. dbVar archives information about structural variation, including large insertions, deletions, translocations and inversions. dbVar also maintains information about the studies that have identified this variation, and reports the association of defined variants with phenotype information.
  • ClinVar, database of medically-related variation. ClinVar aggregates information about sequence variation and its relationship to human health.
  • OMIM, Online Mendelian Inheritance in Man. OMIM is a manually curated database of human genes and genetic disorders maintained by staff at the Johns Hopkins University. It has long reported information about a subset of variants (Allelic Variants) that may have clinical significance. Although OMIM is no longer based at NCBI, its content has been made available for supporting searches.
  • RefSeqGene, this site. RefSeqGene is a subset of NCBI's Reference Sequence project. It defines genomic sequences of well-characterized genes to be used as reference standards. These sequences serve as a stable foundation for reporting mutations, for establishing conventions for numbering exons and introns, and for defining the coordinates of other biologically significant variation. The RefSeqGene group collaborates with multiple LSDBs and the LRG project of GEN2PHEN to establish and maintain these standard sequences. The genes for which the RefSeqGene project has identified LSDBs can be reviewed by querying Gene using the token loprovlsdb[filter].
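The Gene query mentioned in the RefSeqGene entry can also be expressed as a URL for NCBI's E-utilities ESearch service. The sketch below only builds that URL and sends no request. The function name build_lsdb_gene_query and the retmax default are illustrative choices, and whether Gene still recognizes the loprovlsdb filter is taken from this page rather than verified.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Filter token quoted on this page for genes with an identified LSDB.
LSDB_FILTER = "loprovlsdb[filter]"


def build_lsdb_gene_query(retmax=20):
    """Return an ESearch URL that searches Gene with the LSDB filter token.

    retmax limits how many Gene IDs the service would return (hypothetical default).
    """
    params = {"db": "gene", "term": LSDB_FILTER, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Print the URL; fetching it is left to the reader.
    print(build_lsdb_gene_query())
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return an XML list of Gene IDs if the filter is still supported.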

How LSDB and other stakeholders can submit variation data

ClinVar supports submission by XML and spreadsheets. More details are provided on this page of instructions.

NCBI currently provides two web forms to submit information about human variation. Both depend on describing the variant according to the HGVS standard, i.e. relative to a reference sequence and its version. To use these forms, you must register as a data submitter so you can be credited with your current and future contributions. RefSeqGene sequences facilitate this submission process, because describing the location of intronic variants does not require a decision about the location of a splice junction. 
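To illustrate what "relative to a reference sequence and its version" means in practice, the sketch below splits an HGVS-style expression into its accession, version, coordinate type, and change. It is a deliberately simplified pattern, not a full HGVS parser and not the validation NCBI's forms perform: it accepts only two-letter RefSeq-style prefixes (so LRG identifiers are not handled), and the accessions used are made-up placeholders.

```python
import re

# Simplified pattern: accession.version, optional (GENE), then type.change
# This covers only a small part of the HGVS recommendations.
HGVS_PATTERN = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+)\.(?P<version>\d+)"  # e.g. NG_000000.1
    r"(?:\((?P<gene>[^)]+)\))?"                        # optional gene symbol
    r":(?P<kind>[cgmnpr])\.(?P<change>.+)$"            # e.g. g.100A>G
)


def split_hgvs(expression):
    """Split an HGVS-style string into its parts, or raise ValueError."""
    match = HGVS_PATTERN.match(expression.strip())
    if match is None:
        raise ValueError(f"not a recognized HGVS-style expression: {expression!r}")
    parts = match.groupdict()
    parts["version"] = int(parts["version"])
    return parts


if __name__ == "__main__":
    # Placeholder accessions, not real records.
    print(split_hgvs("NG_000000.1:g.100A>G"))
    print(split_hgvs("NM_000000.2:c.100+5G>A"))
```

The second call shows an intronic position written against a transcript (c.100+5), which depends on where the exon boundary is placed; the same variant written against a genomic reference uses a plain g. coordinate instead.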

Human Variation: Search, Annotate, Submit allows you to

  1. enter the description of a single variant,
  2. determine if this variant is already in the database by determining if your entry matches an rs number,
  3. contribute novel information, MIM numbers, PubMed uids, and free text,
  4. review your submission before making it available.

Human Variation: Annotation and Submit Batch Data offers all the features of the single submission form for a large number of records. It also lets you provide your own identifier for each variant, URLs back to your database (both a global URL and one for each variant id), and the number of independent observations of each variant.

In summary, the web sites support submission of the following attributes:

Overall

  1. URL for the submitter organization (LSDB)
  2. Submitter details (contact information)

For each variant described by each submitter

  1. Identifier(s) for associated disease(s) (MIM numbers or MeSH ids)
  2. HGVS name for the variant as defined by public reference sequence (indirectly establishes Entrez GeneID, HGNC ID, MIM number for the gene)
  3. OMIM Allelic variant ID
  4. Submitter's ID (and URL)
  5. PubMed id(s)
  6. Number of independent observations
  7. Clinical consequence (or lack thereof)
  8. Whether the variation is germline or somatic or both
  9. Alternate designations for the variation
  10. Free-text comments
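A curator preparing records before using the forms might collect the per-variant attributes above in one structure. The sketch below is only an illustration: the class name, field names, and the three sanity checks are hypothetical and do not reflect NCBI's submission schema or validation rules.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Values for the germline/somatic attribute (item 8 in the list above).
ALLOWED_ORIGINS = {"germline", "somatic", "both"}


@dataclass
class VariantSubmission:
    """One variant as described by one submitter (hypothetical field names)."""
    hgvs_name: str                                          # item 2
    disease_ids: List[str] = field(default_factory=list)    # item 1: MIM or MeSH ids
    omim_allelic_variant_id: Optional[str] = None           # item 3
    submitter_variant_id: Optional[str] = None              # item 4
    submitter_variant_url: Optional[str] = None             # item 4
    pubmed_ids: List[int] = field(default_factory=list)     # item 5
    observation_count: Optional[int] = None                 # item 6
    clinical_consequence: Optional[str] = None              # item 7
    origin: Optional[str] = None                            # item 8
    alternate_names: List[str] = field(default_factory=list)  # item 9
    comments: str = ""                                      # item 10

    def problems(self) -> List[str]:
        """Return basic sanity problems (assumed checks, not NCBI's)."""
        found = []
        if ":" not in self.hgvs_name:
            found.append("hgvs_name should look like <accession.version>:<description>")
        if self.origin is not None and self.origin not in ALLOWED_ORIGINS:
            found.append("origin should be germline, somatic, or both")
        if self.observation_count is not None and self.observation_count < 0:
            found.append("observation_count cannot be negative")
        return found


if __name__ == "__main__":
    # Placeholder accession, not a real record.
    record = VariantSubmission(hgvs_name="NG_000000.1:g.100A>G", origin="germline")
    print(record.problems())
```

Keeping the HGVS name as the only required field mirrors the point that the gene identifiers follow indirectly from the reference sequence named in it.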

How LSDB and other stakeholders can submit variation data to dbVar

dbVar provides detailed instructions on how to submit data here.

How submitters' data are represented at NCBI

ClinVar supports retrieval of information by submitter, gene, disorder, and allele.

dbSNP has implemented several tools to facilitate access to detailed information about variants of possible clinical significance. The Variation Viewer icon is provided whenever a variant of clinical significance is detected for a gene. This tool allows the user to navigate to

  1. details of all submitters who have reported a variant (Submission Details tab),
  2. links to one or more LSDBs as appropriate (Submission Details tab),
  3. links to PubMed, OMIM, and Gene,
  4. displays of the variant on the genome,
  5. and more.

The display is filtered by default to limit the display of variants for a gene to those that were submitted from an LSDB, from OMIM, or associated with a citation. All variants can, however, be displayed by altering the filter choice.

When you try these examples:

  1. TSC2
  2. MECP2

you will note that there is a Submission Details tab at the bottom of the page, where all depositors are acknowledged even if there is more than one.

dbSNP also provides a report of all submissions from a particular submitter. For example, submitting dalgleish on this form results in this report, which shows the unique submitter descriptions (handles) that satisfy the query. Clicking on a particular handle results in a display like this, which includes contact information and links for details of each submission.

Last updated: 2013-07-12T16:45:05-04:00