What do the gene set counts mean?


This is a gene set table:

Gene set table


Gene set statistics are split into two categories, Gene and Transcript, which provide quick summary counts for the content of a specific gene set. Because alternative splicing is prevalent in eukaryotic genomes, a single gene can encode multiple transcripts, so the Gene and Transcript counts are likely to differ. Each of these categories is further split between Protein_coding (those Genes/Transcripts which, via mRNAs, are translated into polypeptides) and Other, which contains non-coding RNA genes and pseudogenes.
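As a rough illustration of how these four counts relate, the sketch below tallies genes and transcripts into the Protein_coding and Other categories. The record layout, the identifiers and the function name `gene_set_counts` are hypothetical and chosen only for this example; it also assumes, for simplicity, that all transcripts of a gene share one biotype. It is not the code used to produce the gene set table.

```python
from collections import defaultdict

# Hypothetical input: one (gene_id, transcript_id, biotype) tuple per transcript.
# All identifiers below are made up for illustration.
RECORDS = [
    ("GENE0001", "GENE0001-RA", "protein_coding"),  # alternatively spliced gene:
    ("GENE0001", "GENE0001-RB", "protein_coding"),  # two transcripts, one gene
    ("GENE0002", "GENE0002-RA", "protein_coding"),
    ("GENE0003", "GENE0003-RA", "tRNA"),
    ("GENE0004", "GENE0004-RA", "pseudogene"),
]


def gene_set_counts(records):
    """Count distinct genes and transcripts per category (Protein_coding / Other)."""
    genes = defaultdict(set)
    transcripts = defaultdict(set)
    for gene_id, transcript_id, biotype in records:
        # Anything that is not protein coding falls into 'Other'.
        category = "Protein_coding" if biotype == "protein_coding" else "Other"
        genes[category].add(gene_id)
        transcripts[category].add(transcript_id)
    categories = ("Protein_coding", "Other")
    return {
        "Gene": {c: len(genes[c]) for c in categories},
        "Transcript": {c: len(transcripts[c]) for c in categories},
    }


counts = gene_set_counts(RECORDS)
for level, per_category in counts.items():
    print(level, per_category)
```

In this toy input the Protein_coding Transcript count exceeds the Protein_coding Gene count because one gene has two transcripts, which is the effect of alternative splicing described above.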

The 'Other' category covers everything that is not an mRNA. It includes (but is not limited to) the types listed below:

Note that Rfam contains many more families, but those classified as motifs (e.g. the SECIS element motif, RF00031) are filtered out by the VectorBase/Ensembl ncRNA gene prediction pipeline. For all non-coding RNA genes except tRNA and rRNA genes, models are predicted by aligning genomic sequence against Rfam sequences. Rfam makes its annotations available for editing in the online encyclopedia Wikipedia.

