The A. gambiae S form genome was sequenced by the J. Craig Venter Institute (JCVI) (Lawniczak et al. 2010, PMID: 20966253). DNA samples derived from whole mosquitoes were provided by the University of Notre Dame and MR4. BAC libraries were provided by the Clemson University Genomics Institute (CUGI) and are available through CUGI or MR4. Approximately 2.714 million S form sequence reads (traces) were generated and deposited in the NCBI Trace Archive. Eighty-two percent of S traces came from plasmid libraries (38% from 3-4 kb inserts, 44% from 10-12 kb inserts), 16% from 40 kb fosmids, and 3% from 90 kb BACs. Based on the source DNA of these libraries, 97% of the sequence data were generated from heterogametic males, resulting in lower X-chromosome sequence coverage relative to autosomes.
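The figures in the paragraph above can be turned into rough read counts and an expected X-to-autosome coverage ratio. The following is a minimal sketch rather than part of the original analysis. It assumes that reads are sampled in proportion to chromosome copy number, that males carry one X per two autosome sets, and that the remaining 3% of data come from females. All function and variable names are hypothetical.

```python
# Figures quoted in the text; everything else is an illustrative assumption.
TOTAL_READS = 2_714_000          # ~2.714 million S form traces
LIBRARY_SHARE = {                # rounded percent of traces per library type
    "plasmid, 3-4 kb inserts": 38,
    "plasmid, 10-12 kb inserts": 44,
    "fosmid, 40 kb inserts": 16,
    "BAC, 90 kb inserts": 3,
}
MALE_FRACTION = 0.97             # share of sequence data from males


def reads_per_library(total, shares):
    """Approximate read count per library from the rounded percentages."""
    return {name: round(total * pct / 100) for name, pct in shares.items()}


def expected_x_to_autosome_ratio(male_fraction):
    """Expected X:autosome coverage ratio under the simple dosage model.

    Assumption: male-derived data cover X at half the autosomal depth
    (one X copy), female-derived data cover X at full autosomal depth.
    """
    return male_fraction * 0.5 + (1.0 - male_fraction) * 1.0


counts = reads_per_library(TOTAL_READS, LIBRARY_SHARE)
ratio = expected_x_to_autosome_ratio(MALE_FRACTION)

for name, n in counts.items():
    print(f"{name}: ~{n:,} reads")
print(f"Sum of rounded percentages: {sum(LIBRARY_SHARE.values())}%")
print(f"Expected X:autosome coverage ratio: {ratio:.3f}")
```

Under these assumptions the X chromosome would be covered at roughly half the autosomal depth (a ratio of about 0.515). The library percentages sum to 101% because the published values are rounded, so the per-library counts are approximate.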

Whole genome shotgun (WGS) sequences were assembled de novo for each genome by both sequencing centers: at WUGSC using the PCAP assembler (S3), and at JCVI using the Celera assembler. WUGSC assemblies based on the original PCAP algorithm were nearly twice the expected ~260 Mb size (S4). This outcome reflected a large number of high-quality base discrepancies (polymorphisms) arising from relatively high allelic variation in the non-isogenic genome samples. Although a modification of PCAP (Pcap.rep.poly) produced smaller assembled genome sizes, the Celera assembler algorithms specifically developed to accommodate heterozygosity gave improved assemblies. By mutual agreement, the JCVI assemblies (available via GenBank accessions ABKP00000000 and ABKQ00000000) served as the basis for the VectorBase genome browsers.

Genome Size (bp): 
Scaffold count: 13 042
Release date: Friday, November 5, 2010


The A. gambiae S Pimperena colony was established from blood-fed adult females collected in the village of Pimperena, Mali, in November 2005. Approximately five isofemale families molecularly identified as A. gambiae S form were used to establish the colony.
