Information about searches supported by the allele registry
Type of search
Search the allele registry using HGVS expression. The allele registry supports search using HGVS expressions for a wide variety of transcripts, genomic and protein sequences (from NCBI/EBI/LRG). The HGVS expression is a widely used nomenclature for describing a variant. More information about HGVS expressions is provided here.
A descriptor is a representation of CACN that is derived from the ClinVar preferred name. The derivation is intended to handle broader use cases, especially imprecise start and end points.
Copy Number Variation - Overlaps
Use a Copy Number Variation (CACN) descriptor to find overlapping alleles. By default, copy number information is ignored. However, you can uncheck the box to consider copy number information when finding matches.
Search the allele registry using Canonical Allele Identifiers. The Canonical Allele Identifiers are provided by the allele registry. A canonical allele identifier describes multiple equivalent representations of a given allele. The concept of Canonical Allele is described here by the ClinGen data model working group.
Search the allele registry using Canonical Allele Identifiers for copy number gain and loss. The Canonical Allele Identifiers are provided by the allele registry for copy number variations. See CAid based search for details.
ClinVar Variation Id
The ClinVar Variation ID is provided by ClinVar. The Variation ID is a unique identifier for the set of sequence changes that were interpreted (by ClinVar). More information about ClinVar Variation Id is provided here.
ClinVar RCV Id
The ClinVar RCV ID is provided by ClinVar as an accession number that describes a ClinVar submission. For more details visit here
Search the allele registry using dbSNP rs-Identifier. The rs identifier is an identifier for a location and type of variation. More information about dbSNP is available here
Search the allele registry using ExAC Identifier. The ExAC identifier for an allele is a simple representation of the allele using the hg19/GRCh37 reference sequence. The identifier is a concatenated string that describes chromosome, coordinate, reference, and alternate alleles (Chr-Start-Ref-Alt) with "-" as the delimiter. Please visit here for details.
Search the allele registry using gnomAD Identifier. The gnomAD identifier for an allele is a simple representation of the allele using the hg19/GRCh37 reference sequence. The identifier is a concatenated string that describes chromosome, coordinate, reference, and alternate alleles (Chr-Start-Ref-Alt) with "-" as the delimiter. Please visit here for details.
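The Chr-Start-Ref-Alt layout described above can be split into its four fields before a search is submitted. The following is a minimal sketch, not part of the registry itself: the function name `parse_simple_allele_id` and the example identifier `17-7578212-G-A` are hypothetical and used only for illustration.

```python
from typing import NamedTuple


class SimpleAlleleId(NamedTuple):
    """Fields of a Chr-Start-Ref-Alt identifier (hg19/GRCh37 coordinates)."""
    chrom: str
    start: int
    ref: str
    alt: str


def parse_simple_allele_id(identifier: str) -> SimpleAlleleId:
    """Split a '-' delimited identifier into chromosome, start, ref and alt.

    Hypothetical helper: it only checks the overall shape of the string,
    not whether the allele exists on the reference sequence.
    """
    parts = identifier.strip().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected Chr-Start-Ref-Alt, got: {identifier!r}")
    chrom, start, ref, alt = parts
    if not start.isdigit():
        raise ValueError(f"start coordinate is not a number: {start!r}")
    return SimpleAlleleId(chrom, int(start), ref.upper(), alt.upper())


# Hypothetical identifier used only to show the field order.
print(parse_simple_allele_id("17-7578212-G-A"))
```

A check of this kind can catch malformed identifiers (for example, a missing field) before they are sent to the search box.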
MyVariant Id (hg19)
Search the allele registry using MyVariant Identifier. The MyVariant identifier for an allele is a simple representation of the allele using the hg19/GRCh37 reference sequence. The identifier is a variation of the HGVS representation: instead of the reference sequence, it uses the chromosome name. Please visit here for details.
Search the allele registry using any of the identifiers: CAid, CACNid, ClinVar RCV and Variation Id, dbSNP Id.
HGNC Gene Symbol
Search the allele registry using HGNC approved gene symbols. For details of HGNC approved gene symbol visit here
Reference sequence and position
NM_000546.5 , 1 , 300
This search option requires three parameters: the reference sequence, the start position and the end position. The search returns alleles that start within the given region of the reference sequence. This option may be helpful when searching for non-coding alleles within a region on a chromosome reference sequence.
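As an illustration of the three-parameter input, the sketch below parses a comma-separated query such as the example "NM_000546.5 , 1 , 300" and checks whether an allele start falls inside the region. The function names are hypothetical, and treating both region boundaries as inclusive is an assumption; the registry's actual boundary handling is not stated here.

```python
def parse_region_query(text: str) -> tuple[str, int, int]:
    """Parse 'reference , start , end' into its three parameters."""
    fields = [field.strip() for field in text.split(",")]
    if len(fields) != 3:
        raise ValueError("expected: reference sequence, start, end")
    reference, start, end = fields[0], int(fields[1]), int(fields[2])
    if start > end:
        raise ValueError("start position must not exceed end position")
    return reference, start, end


def starts_within(allele_start: int, start: int, end: int) -> bool:
    """True if the allele starts inside the region (inclusive bounds assumed)."""
    return start <= allele_start <= end


reference, start, end = parse_region_query("NM_000546.5 , 1 , 300")
print(reference, start, end)
print(starts_within(215, start, end))
```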
Perform a range query using assembly, chromosome, start and end coordinates specifically on Copy Number Variation alleles. Optional parameters include the minimum and maximum number of copies and whether to consider partial overlaps in the result. The matching alleles are ranked by Manhattan distance.
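To give a sense of what ranking by Manhattan distance means, here is a minimal sketch. It assumes the distance is taken between the query interval and each allele interval as the sum of the absolute differences of their start and end coordinates; which coordinates (and whether copy number) the registry actually includes in the distance is not specified here, so both the function names and that choice are illustrative assumptions.

```python
Interval = tuple[int, int]  # (start, end) on the same assembly and chromosome


def manhattan_distance(query: Interval, allele: Interval) -> int:
    """Sum of absolute differences of the start and end coordinates."""
    return abs(query[0] - allele[0]) + abs(query[1] - allele[1])


def rank_by_distance(query: Interval, alleles: list[Interval]) -> list[Interval]:
    """Return alleles ordered from closest to farthest from the query."""
    return sorted(alleles, key=lambda allele: manhattan_distance(query, allele))


# Made-up coordinates, used only to show the ordering.
query = (100, 200)
candidates = [(90, 260), (100, 210), (150, 190)]
print(rank_by_distance(query, candidates))
```

Under this assumption an allele whose start and end both lie close to the queried coordinates ranks ahead of one that merely overlaps the region.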
Do not have transcript/HGVS expression?
For a substitution with gene symbol, position, reference and alternate alleles known, please use this service:
This option provides a search box for entering multiple HGVS expressions (one per line). For alleles present in the allele registry, the search returns canonical allele identifiers. For valid alleles not present in the allele registry, the search results provide a button to register an allele.
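When preparing input for the multi-line search box, it can help to normalize the text first. The sketch below is a hypothetical helper, not registry code: it trims whitespace, drops blank lines and removes repeated expressions while keeping the original order. Deduplication is a convenience chosen for this example, not a stated requirement of the registry.

```python
def prepare_hgvs_batch(raw_text: str) -> list[str]:
    """Return one cleaned HGVS expression per entry, in input order."""
    seen = set()
    batch = []
    for line in raw_text.splitlines():
        expression = line.strip()
        # Skip empty lines and expressions already collected.
        if not expression or expression in seen:
            continue
        seen.add(expression)
        batch.append(expression)
    return batch


raw = """NC_000017.10:g.7578212G>A

  NC_000017.10:g.7578212G>A
NC_000001.10:g.35366C>T
"""
# Join with newlines to get the one-expression-per-line form.
print("\n".join(prepare_hgvs_batch(raw)))
```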
Presentation by Erin Riggs that provides introduction and basic use cases of the ClinGen Allele Registry
Presentation by Prof. Aleks Milosavljevic at NHGRI-KOMP2 annual meeting
Presentation by Andrew Grant on How to map legacy variants to ClinGen Allele Registry identifiers?
What is ClinGen Allele Registry?
The ClinGen Allele Registry provides unique variant identifiers both programmatically (via APIs) and via this search interface. If a variant is not present in the Registry, authorized users may register the variant and get an identifier within seconds. The variants are automatically mapped across known reference sequences and to identifiers from major variant databases.
The current content of the Registry is searchable using HGVS expressions representing nucleic acid or amino acid variants across more than 500,000 reference sequences (genome assemblies, transcripts, amino acid sequences). Alleles can also be queried by locus, gene and ClinVar or dbSNP identifiers. The registry regularly imports variants from ExAC, ClinVar and other databases.
To facilitate wide integration of the registry services with existing software and workflows for variant evaluation, e.g. Pathogenicity Calculator, all the functionalities of the registry are exposed via REST APIs. The API also allows for bulk query and registration of variants. Hundreds of variants saved as HGVS expressions can be processed as a batch in less than ten seconds. This will also be extended to VCF files in the near future. For instructions on how to register large batches of variants, follow the "API specification" provided on the home page.
To register new alleles in the Allele Registry, you will need a valid login and password. To create a login, please send an email request to firstname.lastname@example.org with a preferred login name.
Several systems use "chr1", "chr2", etc. as reference sequences in HGVS expressions (e.g. chr17:g.7578212G>A). The allele registry does not recognize such HGVS expressions, as they do not provide information about organism and assembly. To convert such a representation into an HGVS expression that will be recognized by the allele registry (e.g. NC_000017.10:g.7578212G>A), users need to replace the chromosome name with a specific reference sequence (e.g. "chr17" with NC_000017.10 or NC_000017.11 for the GRCh37 or GRCh38 human assembly, respectively). Please use this table for mapping chromosome to reference sequence accession.
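The replacement of a chromosome name with a reference sequence accession can be sketched as follows. This is a minimal illustration, not registry code: the function name `to_refseq_hgvs` is hypothetical, and the lookup table is deliberately partial (chromosomes 1 and 17 only), so the remaining accessions would need to be filled in from the mapping table. The function only swaps the name for the accession; it does not convert coordinates between assemblies, so the assembly passed in must be the one the coordinates already refer to.

```python
import re

# Partial lookup table (chromosomes 1 and 17 only); complete it from the
# chromosome-to-accession mapping table before real use.
CHROM_TO_REFSEQ = {
    "GRCh37": {"1": "NC_000001.10", "17": "NC_000017.10"},
    "GRCh38": {"1": "NC_000001.11", "17": "NC_000017.11"},
}

# Matches expressions such as 'chr17:g.7578212G>A'.
_CHR_HGVS = re.compile(r"^chr(\w+)(:g\..+)$", re.IGNORECASE)


def to_refseq_hgvs(expression: str, assembly: str) -> str:
    """Replace a 'chrN' prefix with the accession for the given assembly."""
    match = _CHR_HGVS.match(expression.strip())
    if match is None:
        raise ValueError(f"not a chromosome-based g. expression: {expression!r}")
    chrom, change = match.group(1).upper(), match.group(2)
    try:
        accession = CHROM_TO_REFSEQ[assembly][chrom]
    except KeyError:
        raise ValueError(f"no accession known for chr{chrom} on {assembly}") from None
    return accession + change


print(to_refseq_hgvs("chr17:g.7578212G>A", "GRCh37"))
```

Keeping the assembly as an explicit argument makes the missing information (organism and assembly) visible at the point of conversion rather than guessed.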
Allele registry supports GA4GH refget and VR standards
Allele registry APIs extended to support GA4GH-refget and GA4GH-VR standards
HGVS expression with MANE preferred sequence now shows amino-acid mapping
Added a link to video by ClinGen Education Team on landing page
Registry landing page updated to include release notes, citation, and pointers to introductory videos
Source capitalization fixed for the user contributed link-outs
Allele Registry now shows the MANE preferred transcript at the interface.
Fixed an issue with protein consequences for inframe insertion/deletion
Search by dbSNP identifiers now returns more than one hit when applicable. The previous version always returned a single hit.
Several transcripts were missing associations with genes (e.g. GJB2/GRIA4), as they were missing in the files provided by HGNC. As a result, gene information was not shown in the variant-centric page, and a search by the corresponding HGNC symbol did not retrieve any hits. With this update, the transcripts are correctly associated and search retrieves appropriate results.
HGVS expressions defined on an NCBI transcript sequence now provide a link-out to the recently published Variant Validator (Freeman et al., Human Mutation. 2018;39:61–68). The article and method claim to provide accurate HGVS expressions and to correct them on the fly where needed.
We have developed a semiautomatic pipeline for importing ClinVar updates. With this, just a couple of commands are sufficient to update ClinVar associations in the ClinGen Allele Registry. As a result, we hope to stay in sync with ClinVar updates.