Chaerkady et al. (2011) Anopheles gambiae mass spec. peptides


Description of data

Annotation of sequenced genomes is a complex task, especially in the case of eukaryotic genomes. Along with the use of prediction programs, manual curation is required to achieve accurate gene annotation. Although a combination of prediction strategies is used for genome annotation, prediction of small genes, intron-exon boundaries and alternative transcripts remains challenging. We present high-resolution mass spectrometry based proteomics as a complementary approach for refining genome annotation. In this 'proteogenomic' analysis, peptide sequences obtained from mass spectrometry based shotgun sequencing are mapped back to the genome and used for gene prediction in a manner similar to cDNA/EST sequences.
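The idea of mapping peptides back to the genome can be illustrated with a minimal sketch. It assumes the standard genetic code and exact string matching of a peptide against a six-frame translation; the function names (translate, six_frame_translate, map_peptide) and the toy sequence are hypothetical and are not taken from the study, which used the Mascot search algorithm on spectra rather than string matching.

```python
# Illustrative sketch only: exact matching of a peptide against a
# six-frame translation. Assumes the standard genetic code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translate(genome):
    """Map (strand, offset) to the translation of that reading frame."""
    frames = {}
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for offset in range(3):
            frames[(strand, offset)] = translate(seq[offset:])
    return frames


def map_peptide(peptide, genome):
    """Return (strand, start, end) hits as 0-based, half-open coordinates
    on the forward strand of the genome."""
    hits = []
    n = len(genome)
    for (strand, offset), protein in six_frame_translate(genome).items():
        pos = protein.find(peptide)
        while pos != -1:
            start = offset + 3 * pos
            end = start + 3 * len(peptide)
            if strand == "-":
                # Convert reverse-strand coordinates to forward-strand ones.
                start, end = n - end, n - start
            hits.append((strand, start, end))
            pos = protein.find(peptide, pos + 1)
    return hits


toy_genome = "CCATGAAACGTTAGG"  # hypothetical sequence encoding M-K-R
print(map_peptide("MKR", toy_genome))
```

A real pipeline would also have to handle peptides that span exon junctions, which a plain six-frame translation cannot recover.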

For the mass spectrometry based analysis of the proteome, proteins are digested into peptides and the sequence is deduced from fragment ion spectra derived from individual peptides. We carried out a comprehensive mass spectrometry analysis of proteins isolated from larvae, pupae and various adult mosquito tissues, namely midgut, salivary gland, ovary, Malpighian tubules, testis, male accessory organs, head and viscera. All the analyses were carried out on a Fourier transform mass spectrometer using high resolution MS and MS/MS parameter settings. We analyzed the mass spectrometry derived data using the Mascot search algorithm with data deconvolution against a protein database and a six-frame translation of the Anopheles gambiae genome. Peptides that mapped to parts of the genome where no CDS is annotated were categorized by region (intergenic, intron or UTR) and used to refine the gene annotation. Novel splice isoforms were identified using an exon junction peptide database for hypothetical alternative transcripts. This large-scale proteogenomic analysis led to the identification of many novel genes and novel splice isoforms, as well as corrected and validated gene annotations in the Anopheles gambiae genome.
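The categorization of mapped peptides by region can be sketched as follows. This is a simplified illustration, not the procedure used in the study: the gene model layout (a dict with "exons" and "cds" lists of 0-based, half-open intervals), the function names and the priority order (CDS, then UTR, then intron, then intergenic) are all assumptions, and strand is ignored.

```python
# Illustrative sketch only: classify a mapped peptide by the annotated
# region it overlaps. Gene model layout and priority order are assumptions.

def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one position."""
    return a_start < b_end and b_start < a_end


def classify_peptide(start, end, genes):
    """Return 'CDS', 'UTR', 'intron' or 'intergenic' for a peptide mapped
    to [start, end). Each gene is a dict with 'exons' and 'cds' interval
    lists; CDS intervals are assumed to lie within exons."""
    for gene in genes:
        if any(overlaps(start, end, s, e) for s, e in gene["cds"]):
            return "CDS"
    for gene in genes:
        # Overlaps an exon but no CDS: the non-coding part of an exon.
        if any(overlaps(start, end, s, e) for s, e in gene["exons"]):
            return "UTR"
    for gene in genes:
        gene_start = min(s for s, _ in gene["exons"])
        gene_end = max(e for _, e in gene["exons"])
        # Inside the gene span but outside every exon.
        if overlaps(start, end, gene_start, gene_end):
            return "intron"
    return "intergenic"


# Hypothetical two-exon gene used only for demonstration.
toy_genes = [{"exons": [(100, 200), (300, 400)], "cds": [(150, 200), (300, 350)]}]
for span in [(160, 190), (110, 140), (220, 250), (500, 530)]:
    print(span, classify_peptide(*span, toy_genes))
```

Peptides classified as anything other than CDS are the candidates for refining annotation, for example by extending a coding region into a UTR or by proposing a new gene in an intergenic region.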


A proteogenomic analysis of Anopheles gambiae using high-resolution Fourier transform mass spectrometry.
Chaerkady R, Kelkar DS, Muthusamy B, Kandasamy K, Dwivedi SB, Sahasrabuddhe NA, Kim MS, Renuse S, Pinto SM, Sharma R, Pawar H, Sekhar NR, Mohanty AK, Getnet D, Yang Y, Zhong J, Dash AP, MacCallum RM, Delanghe B, Mlambo G, Kumar A, Prasad TS, Okulate M, Kumar N, Pandey A.
Genome Res. 2011 Sep 30.