Genome Analysis Of Vectorial Capacity In Major Anopheles Vectors Of Malaria Parasites

Malaria causes an estimated 300-500 million cases and kills three million children annually. Despite
considerable emphasis on the development and deployment of control methods, the disease remains a
significant threat. Mosquito control in particular has suffered from the development of resistance to
insecticides. Of the ~500 anopheline species, only two dozen are important vectors of human malaria
parasites. Why some members of the same anopheline species transmit malaria parasites while others do not,
or are less efficient, is of intense interest to vector biologists. Developing a better understanding of this
‘vectorial capacity’ may enable its eventual manipulation in order to reduce disease burden.

This document proposes sequencing of 13 anopheline vector genomes (average size ~250 Mb), representing
26 billion base pairs, to complement and facilitate comparative analysis with the three anopheline genomes
already sequenced: Anopheles gambiae PEST and the M and S forms. Using An. gambiae as the anchor and adopting a
‘ladder-and-constellation’ approach inspired by the successful 12 Drosophila genomes project, we propose
deep sampling of species belonging to the An. gambiae sibling species complex (Tier 1), followed by sampling
at increasing evolutionary distances within the three main Anopheles subgenera (Tiers 2 and 3), with
particular emphasis on subgenus Cellia that contains An. gambiae. In addition to genomic sequencing, we
propose EST sequencing for each species in support of genome annotation.
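
The sequencing volume quoted above can be checked with simple arithmetic. The sketch below uses only the three figures given in the text (13 genomes, ~250 Mb average size, 26 billion base pairs). It assumes, without confirmation from the source, that the 26 billion figure counts total sequenced bases rather than assembled genome length; under that assumption it reports the implied average sequencing depth. The function and variable names are illustrative only.

```python
# Figures quoted in the proposal text.
N_GENOMES = 13
AVG_GENOME_SIZE_BP = 250_000_000   # ~250 Mb average genome size
TOTAL_BP = 26_000_000_000          # "26 billion base pairs"


def combined_genome_length(n_genomes, avg_size_bp):
    """Summed length of all target genomes, counted once each."""
    return n_genomes * avg_size_bp


def implied_fold_coverage(total_bp, n_genomes, avg_size_bp):
    """Average depth implied if total_bp counts all sequenced bases (an assumption)."""
    return total_bp / combined_genome_length(n_genomes, avg_size_bp)


unique_bp = combined_genome_length(N_GENOMES, AVG_GENOME_SIZE_BP)
coverage = implied_fold_coverage(TOTAL_BP, N_GENOMES, AVG_GENOME_SIZE_BP)

print(f"Combined genome length: {unique_bp / 1e9:.2f} Gb")
print(f"Implied average depth:  {coverage:.1f}x")
```

Thirteen genomes of ~250 Mb sum to about 3.25 Gb, so the 26 billion base pair figure would correspond to roughly eightfold coverage if it is read as total sequence generated.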

Generating genome sequence data using this scheme will allow inferences about both rapid and gradual
evolutionary changes relevant to vector ability. This is necessary to determine, for example, the underlying
genetic determinants of feeding preference, since these are unlikely to be conserved across large evolutionary
distances and specialization on human blood feeding is likely to have been a very recent evolutionary event. It
will also enable the development of powerful genomic tools that are the necessary foundation for identifying
new approaches to the control of vectors whose biology is poorly understood, in contrast to genetic and
evolutionary models such as Drosophila.

Although vector status was of prime importance in the selection of sequencing targets, choice was constrained by availability of colonies housed by the Malaria Research and Reference Reagent Resource Center (MR4), which will be a project and community resource for DNA, RNA and live mosquitoes from colonies. Since the initial whitepaper was approved, two additional species have been added to the project following the acquisition of available sequencing template: An. melas and An. christyi.

In addition to the approved goals of (1) high quality reference genome assemblies of each species and (2) transcriptome sequencing in support of gene annotation, a limited amount of SNP discovery based on wild specimens will augment these genome projects. Illumina-based genome sequencing and assembly, RNAseq and SNP discovery will be managed by the Broad Institute (under the direction of Daniel Neafsey). Genome annotation will be based on contributions by the Broad Institute, VectorBase, and members of the scientific community, whose input is encouraged. Once available, assemblies and gene models will be made accessible to the public. Production sequencing is beginning in spring 2011 and initial genome assemblies and annotations are expected to be available by late summer or early fall of 2012.

Overall coordination of the project will be handled by Nora Besansky (nbesansk at) and a Coordinating Committee (AGCC) [1]. In addition, as the need arises, individuals have agreed to serve as community liaisons [2] between focal groups and the AGCC.

This project was inspired by very ambitious goals: improved understanding of vectorial capacity, and the application of that understanding toward reducing malaria disease burden. Accordingly, its success depends upon community input at all levels. Please contact Nora Besansky or any members of the AGCC with questions, comments or suggestions. The AGCC is presently coordinating a community analysis strategy for the data to be generated under this whitepaper. The AGCC plan is to publish two flagship papers broadly covering the evolutionary genomics of the An. gambiae species group and the larger set of species in relation to malaria.

The AGCC will also help coordinate community efforts to more deeply explore topics of special interest, for publication as focal papers. Parties interested in contributing to the initial analysis of these data, whether for the flagship manuscripts or the focal manuscripts or both, should send a brief email to Nora Besansky (nbesansk at) describing the analysis topic of interest. To facilitate coordination, transparency, and maximal community engagement, the AGCC will coordinate a project wiki by late summer 2011 with available information about analysis efforts and who is leading them.



| Species | Classification | Reference Assembly (Geographic Source) | SNP Discovery (Geographic Source) |
|---|---|---|---|
| 1. An. arabiensis | Series Pyretophorus (Gambiae complex) | Dongola (Sudan) [Isofemale subcolony, 2Rb/b homokaryotype] | Burkina Faso, Cameroon, Kenya |
| 2. An. quadriannulatus A | (Gambiae complex) | SANGWE (South Africa) [Isofemale subcolony, heterokaryotype X+f/f] | |
| 3. An. merus | (Gambiae complex) | MAF (South Africa) [Isofemale subcolony] | Kenya, South Africa |
| 4. An. melas | (Gambiae complex) | NO COLONY [Sequencing from wild collected from Cameroon] | Bioko, Equatorial Guinea; Ipono, Cameroon; Ballingho, The Gambia |
| 5. An. christyi | | NO COLONY [Sequencing from wild collected from Kenya] | |
| 6. An. epiroticus | (Sundaicus complex) | NO COLONY [Sequencing from wild collected from Vietnam] | Vietnam (resistance) |
| 7. An. stephensi | Series Neocellia | SDA-500 (Pakistan) [Isofemale subcolony] | |
| 8. An. maculatus (sp. B) | (Maculatus Subgroup) | COLONY NOT AT MR4 (Kuala Lumpur) [Sequencing from preserved females] | |
| 9. An. funestus | Series Myzomyia (Funestus Subgroup) | FUMOZ (Mozambique) | Burkina Faso (Folonzo & Kiribina) |
| 10. An. minimus s.s. (sp. A) | (Minimus complex) | MINIMUS1 (Thailand) | Thailand (cline) |
| 11. An. culicifacies A | (Culicifacies Subgroup) | NO COLONY [Sequencing from wild collected from Iran] | Iran (species A, species D, species A-like) |
| 12. An. farauti 1 | Series Neomyzomyia | FAR1 (Papua New Guinea) [Isofemale subcolony] | |
| 13. An. dirus s.s. (sp. A) | (Dirus complex) | WRAIR2 (Thailand) [Isofemale subcolony] | Thailand (species A/D) |
| 14. An. atroparvus | Subgenus Anopheles | EBRO (Spain) [Isofemale subcolony] | |
| 15. An. albimanus | Subgenus Nyssorhynchus | STECLA (El Salvador) [Isofemale subcolony] | |
| 16. An. sinensis | (Hyrcanus group) | CHONGQING (China) [Isofemale subcolony] | |


Samples are in sequencing; check back later for data. 

[1] Anopheles Genomes Cluster Coordinating Committee (AGCC):

George Christophides: g.christophides at; Frank Collins: frank at; Scott Emrich: semrich at; William Gelbart: gelbart at; Matthew Hahn: mwh at; Paul Howell: bsr7 at; Fotis Kafatos: f.kafatos at; Daniel Lawson: Lawson at; Marc Muskavitch: marc.muskavitch at; Daniel Neafsey: neafsey at; Nora Besansky: nbesansk at

[2] Community Liaisons:

An. gambiae and sibling species: Dr. Nora Besansky (nbesansk at); An. funestus: Drs. N'Fale Sagnon (n.fale.cnlp at) and W. Guelbeogo (guelbeogo.cnrfp at); An. stephensi: Drs. Igor Sharakhov (igor at) and Jake Tu (jaketu at); An. farauti: Dr. Nigel Beebe (n.beebe at); An. dirus, An. minimus: Dr. Catherine Walton (Catherine.Walton at); An. atroparvus: Drs. Maria Sharakhova (An. albimanus: Dr. Martinez Barnetche (jmbarnet at)