Chromosome

A zoomed-in view of a chromosome is shown, including graphical displays of known and novel genes, percentage GC and repeat content, and variation density. Click on the chromosome to zoom in to Region in Detail. To add your own annotation to a single chromosome or to a whole karyotype, use the Custom data link from this view or from the Karyotype view.

Each species in Ensembl has a number of statistics for its genome assembly. These statistics are also found on species-specific home pages and are calculated as follows. Some counts may only be available from the species home page.

Base Pairs per chromosome

These are pre-calculated in order to speed up page display, and stored in the seq_region table of the core database. The number is based on the assembled end position of the last seq_region in each chromosome (from the AGP file), or if there is a terminal gap it is set to the assembled end location of that terminal gap.
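As a rough illustration of the rule above, the following Python sketch derives a chromosome length from AGP-style lines by taking the largest assembled end position, whether it belongs to a sequence component or to a terminal gap. The function name chromosome_length and the sample lines are hypothetical, and this is not the code Ensembl uses to populate the seq_region table.

```python
def chromosome_length(agp_lines, chromosome):
    """Return the largest assembled end position for one chromosome.

    AGP lines are tab-separated; the first three columns are the object
    name, the object start and the object end. Gap lines (component
    type N or U) use the same three columns, so a terminal gap is
    included without special handling.
    """
    length = 0
    for line in agp_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        if fields[0] != chromosome:
            continue  # line belongs to a different object
        length = max(length, int(fields[2]))
    return length


# Hypothetical example data: two contigs followed by a terminal gap.
example_agp = [
    "chrA\t1\t1000\t1\tF\tcontig1\t1\t1000\t+",
    "chrA\t1001\t1800\t2\tF\tcontig2\t1\t800\t+",
    "chrA\t1801\t2000\t3\tN\t200\ttelomere\tno\tna",
]
print(chromosome_length(example_agp, "chrA"))
```

With the terminal gap included the sketch reports 2000; without the last line it would report 1800, the end of the last sequence component.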

For the haplotype chromosomes (c6_COX etc.), although there is only haplotype-specific sequence for a small region of the chromosome, the length of the seq_region is set to the full length of the chromosome including the specific haplotype (e.g. c6_COX is 170899992 bp long).

Gene summaries

The number of genes of each type is listed below the chromosome, as follows:

Known Gene Count gives the number of known protein-coding genes that Ensembl has predicted on this chromosome. Known genes have been mapped by Ensembl to species-specific protein sequences already available in the public sequence databases.

Novel Gene Count gives the number of novel genes that Ensembl has predicted on this chromosome. Novel genes, although predicted on the basis of similarity to protein or cDNA sequences and/or ESTs, could not be mapped with confidence to existing entries for the same species.

Pseudogenes and non-coding (nc)RNA genes are also counted. ncRNA genes are annotated as several sub-classes, and counts per RNA gene class are available from this page.

Please note: Gene counts presented per chromosome on Ensembl Chromosome views cover only the areas shown. Gene counts for all chromosomes may not add up to the numbers presented for the whole genome on the species-specific home pages. This is due to extra-chromosomal, haplotypic sequences, which are annotated with genes but not necessarily displayed. The counts also differ because pseudo-autosomal regions (PAR) on the human X and Y chromosomes count towards the whole-genome statistics only once.

SNP Count lists the number of variations that Ensembl has placed on this chromosome.

(Species Home Page) Base Pairs (whole assembly)

The total number of base pairs for the entire assembly is the sum of all sequences in the dna table of the core database. It is available from the species-specific home page. This includes redundant regions such as haplotypic sequences and the pseudo-autosomal region (PAR) of the Y chromosome in human, and gaps in Drosophila melanogaster. See the assembly details of each species for more information.

(Species Home Page) Golden Path

The "golden path" is the length of the reference assembly. It consists of the sum of all top-level sequences in the seq_region table, omitting any redundant regions such as haplotypes and PARs.
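The difference between the whole-assembly base pair count and the golden path can be sketched in a few lines of Python. This is a deliberately simplified model: it treats every sequence as a flat record with a hypothetical redundant flag, uses made-up lengths, and ignores the fact that the two figures come from different tables (dna and seq_region) in the core database.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    length: int       # length in base pairs
    redundant: bool   # hypothetical flag: True for haplotypes, PARs, etc.


def total_base_pairs(regions):
    """Sum every sequence, redundant regions included."""
    return sum(r.length for r in regions)


def golden_path_length(regions):
    """Sum only the non-redundant sequences of the reference assembly."""
    return sum(r.length for r in regions if not r.redundant)


# Made-up lengths, for illustration only.
regions = [
    Region("1", 1000, False),
    Region("2", 800, False),
    Region("Y_PAR", 50, True),
    Region("haplotype_A", 120, True),
]

print(total_base_pairs(regions))
print(golden_path_length(regions))
```

In this toy data set the two totals differ by exactly the 170 bp contributed by the redundant regions.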

To add user data to this display, click on the Custom data link at the left and upload a file, such as a GFF file. If you have already uploaded data to another view, you can turn that track on here by clicking on the Configure this page link and selecting the track in the User data menu.
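For readers who have not prepared such a file before, the Python sketch below formats a couple of records in the classic nine-column, tab-separated GFF layout (seqname, source, feature, start, end, score, strand, frame, group). The helper name gff_line, the feature and group labels, and the coordinates are all hypothetical. Which GFF variant and which chromosome naming the upload accepts is an assumption here and should be checked against the custom data documentation.

```python
def gff_line(seqname, start, end, feature="region", source="user",
             score=".", strand=".", frame=".", group=""):
    """Format one tab-separated GFF record with nine columns."""
    if start > end:
        raise ValueError("start must not exceed end")
    columns = [seqname, source, feature, str(start), str(end),
               str(score), strand, frame, group]
    return "\t".join(columns)


# Hypothetical annotations on chromosome 6; the coordinates are made up.
records = [
    gff_line("6", 1000000, 1005000, feature="my_region", group="regionA"),
    gff_line("6", 2500000, 2501200, feature="my_region", strand="+",
             group="regionB"),
]

# Print the records; redirect the output to a file to get something uploadable.
for record in records:
    print(record)
```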

Note: The display is customisable. The gene densities and variation histogram may be turned off using the Configure this page link.