How are gene descriptions propagated between species?


Some of the species in VectorBase have been annotated more extensively than others, so it is useful to propagate gene descriptions from these species to closely related ones. Gene descriptions (but not gene names) are propagated based on orthology.

If there is one-to-one or one-to-many orthology between a gene in a source (i.e. well-annotated) species and a gene in a target species, the description is propagated from the source to the target if all of the following conditions are met:

  • the source gene has a description, and it does not contain the words 'hypothetical' or 'putative'
  • the target gene has no existing name or description
  • the two genes share >30% amino acid sequence identity
  • the alignment covers >66% of the length of each gene
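
The conditions above can be sketched as a single check. This is a minimal illustration under stated assumptions, not the actual VectorBase pipeline code: the Gene and OrthologyPair types, their field names, the orthology type strings and the is_propagatable function are all hypothetical. Matching 'hypothetical' and 'putative' as case-insensitive substrings, and using a single identity value for the pair, are also assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the conditions listed above.
MIN_IDENTITY_PCT = 30.0
MIN_COVERAGE_PCT = 66.0
EXCLUDED_WORDS = ("hypothetical", "putative")

# Hypothetical labels for the orthology types that qualify.
ALLOWED_TYPES = ("one-to-one", "one-to-many")


@dataclass
class Gene:
    """Hypothetical minimal gene record."""
    stable_id: str
    name: Optional[str] = None
    description: Optional[str] = None


@dataclass
class OrthologyPair:
    """Hypothetical record linking a source gene to a target orthologue."""
    source: Gene
    target: Gene
    orthology_type: str
    identity_pct: float          # amino acid sequence identity
    source_coverage_pct: float   # share of the source gene covered by the alignment
    target_coverage_pct: float   # share of the target gene covered by the alignment


def is_propagatable(pair: OrthologyPair) -> bool:
    """Return True only if every propagation condition holds."""
    if pair.orthology_type not in ALLOWED_TYPES:
        return False
    # The source must have an informative description.
    if not pair.source.description:
        return False
    lowered = pair.source.description.lower()
    if any(word in lowered for word in EXCLUDED_WORDS):
        return False
    # The target must not already carry a name or description.
    if pair.target.name or pair.target.description:
        return False
    # Both thresholds are strict ("greater than").
    if pair.identity_pct <= MIN_IDENTITY_PCT:
        return False
    return min(pair.source_coverage_pct, pair.target_coverage_pct) > MIN_COVERAGE_PCT
```

Taking the minimum of the two coverage values is one way to express that the alignment must cover more than 66% of both genes, not just one of them.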

When a description is propagated to the target gene, it retains the source description's provenance, and the source species and gene stable ID are added. A digit at the end of a description usually indicates a species-specific element of the annotation, so it is removed during propagation.
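
As a rough sketch of this step, the function below strips a trailing number and appends the provenance and source details. The function name, its parameters and the output layout are hypothetical; the text above does not specify the exact format VectorBase uses. Removing only a whitespace-separated trailing number (so that a description such as "cytochrome P450" is left intact) is also an assumption.

```python
import re

# Assumption: only a trailing number separated by whitespace is removed.
_TRAILING_NUMBER = re.compile(r"\s+\d+$")


def build_target_description(description, provenance, source_species, source_stable_id):
    """Build a description for the target gene (hypothetical output layout)."""
    cleaned = _TRAILING_NUMBER.sub("", description.strip())
    return (
        f"{cleaned} [{provenance}] "
        f"(propagated from {source_species} {source_stable_id})"
    )


# Placeholder values, not real database records.
print(build_target_description(
    "odorant receptor 22", "Source:ExampleDB", "Anopheles gambiae", "SOURCE_GENE_ID"
))
```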

Descriptions are propagated between the following species:

  • Aedes aegypti to Aedes albopictus
  • Anopheles gambiae to the other Anophelines
  • Glossina morsitans to the other Glossinidae, Musca domestica and Stomoxys calcitrans
  • Drosophila melanogaster to Glossinidae, Musca domestica and Stomoxys calcitrans