The AgamP3.8 gene set updates the ncRNAs using the current Ensembl Genomes pipeline. No protein-coding loci were modified. The changes made from gene set 3.7 to 3.8 are described below.

There were 650 RNA genes in the AgamP3.7 gene set. Most of these (556) were predictions from Rfam, RNAmmer, and tRNAScan (the standard Ensembl Genomes ncRNA prediction pipeline), but they had not been updated for years. For AgamP3.8, those same three programs were run again, and the results make up the bulk of the current ncRNA annotation. In many cases the same genes were predicted as before, sometimes with very slight modifications to the gene bounds; a few genes were lost, and some new ones were added.

In summary:

  • 479 unchanged genes
  • 54 changed genes
  • 125 new genes
  • 23 deleted genes

Where genes were changed, added, or deleted, the stable IDs are tracked in the same manner as for protein-coding genes. That is, on a gene or transcript page in the genome browser, the "ID History" link at the bottom of the left-hand menu shows the history of changes. You can also map IDs between releases with the "ID History converter" tool, available through the "Browser Tools" link at the top of any genome browser page (or by following this link).

If needed for your research, follow this link and download the two text files with all the ncRNA ID mappings.

In addition to the previous results of the ncRNA pipeline, AgamP3.7 contained 94 miRNA genes derived from miRBase. These were not predicted by the recent run of the ncRNA pipeline, even though Rfam should incorporate all miRNA genes. However, some of these miRNAs are annotated with additional evidence, so they were all retained. The prediction method displayed on a gene page lets you distinguish between these genes and those that result from the standard ncRNA pipeline:

  • Prediction Method: MicroRNAs derived from miRBase; follow this link for an example.
  • Prediction Method: ncRNA genes are predicted using a combination of methods depending on their type. tRNAs are predicted using tRNAScan-SE, rRNAs using RNAmmer, and for all other types, using covariance models and sequences from RFAM; follow this link for an example.
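
If you have copied the "Prediction Method" text for a set of genes, the two cases above can be told apart with a simple keyword check. The following is only a minimal sketch: the function name `prediction_source` is hypothetical, the keywords are taken from the two descriptions quoted above, and how you obtain the text from the browser is left open.

```python
def prediction_source(method_text: str) -> str:
    """Classify a gene by its 'Prediction Method' text.

    Hypothetical helper: it only looks for the tool names that appear
    in the two descriptions quoted in these release notes.
    """
    text = method_text.lower()
    # miRNA genes carried over from AgamP3.7 mention miRBase.
    if "mirbase" in text:
        return "miRBase"
    # Genes from the standard pipeline mention at least one of its tools.
    if any(tool in text for tool in ("trnascan", "rnammer", "rfam")):
        return "ncRNA pipeline"
    return "unknown"


if __name__ == "__main__":
    print(prediction_source("MicroRNAs derived from miRBase"))
    print(prediction_source(
        "tRNAs are predicted using tRNAScan-SE, rRNAs using RNAmmer, "
        "and for all other types, using covariance models and sequences from RFAM"
    ))
```

Text that mentions none of these names is reported as "unknown" rather than guessed at.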

The final AgamP3.8 gene set therefore contains 556 + 125 - 23 + 94 = 752 RNA genes (plus 5 pseudogenes, for a total of 757 genes).
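
The arithmetic behind these totals can be checked with a short script. This is just an illustrative sketch: all counts are copied from these release notes, while the variable and function names are made up for the example.

```python
# Counts copied from the release notes; names are illustrative only.
PIPELINE_GENES_3_7 = 556   # Rfam / RNAmmer / tRNAScan predictions in AgamP3.7
MIRBASE_GENES = 94         # miRNA genes derived from miRBase, all retained
UNCHANGED, CHANGED, NEW, DELETED = 479, 54, 125, 23
PSEUDOGENES = 5
PROTEIN_CODING_GENES = 12810


def tally() -> dict:
    """Recompute the gene totals reported for AgamP3.7 and AgamP3.8."""
    # Every old pipeline gene is either unchanged, changed, or deleted.
    assert UNCHANGED + CHANGED + DELETED == PIPELINE_GENES_3_7

    rna_genes_3_7 = PIPELINE_GENES_3_7 + MIRBASE_GENES
    rna_genes_3_8 = PIPELINE_GENES_3_7 + NEW - DELETED + MIRBASE_GENES
    other_genes = rna_genes_3_8 + PSEUDOGENES
    return {
        "rna_genes_3_7": rna_genes_3_7,
        "rna_genes_3_8": rna_genes_3_8,
        "other_genes": other_genes,
        "all_genes": PROTEIN_CODING_GENES + other_genes,
    }


if __name__ == "__main__":
    for name, count in tally().items():
        print(f"{name}: {count}")
```

The results agree with the figures quoted in these notes: 650 RNA genes in AgamP3.7, 752 RNA genes in AgamP3.8, 757 "Other" genes, and 13,567 genes overall.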

Note: There is a VectorBase group annotating ncRNAs within the framework of the 16 Anopheline genomes project. Later this year, probably in June, there will be an updated set of RNA genes. In addition to Rfam, RNAmmer and tRNAScan, this other pipeline includes miRPara, HHMMir and RNAz. The plan is to reconcile the new dataset with the one currently available.


              Total     Protein-coding   Other
Genes         13,567    12,810           757
Transcripts   15,424    14,667           757

Release date: 25 Feb 2014


