In September 2008, the National Human Genome Research Institute (NHGRI) and the National Institute of Allergy and Infectious Diseases (NIAID) of the U.S. National Institutes of Health approved funding for sequencing and assembly of the genomes and transcriptomes of thirteen Anopheles species, as described in the white paper "Genome Analysis Of Vectorial Capacity In Major Anopheles Vectors Of Malaria Parasites". This project was inspired by ambitious goals: improved understanding of vectorial capacity, and the application of that understanding toward reducing malaria disease burden.
Since the initial white paper was approved, three additional species have been added to the project following the acquisition of available sequencing template: An. melas, An. christyi, and An. sinensis. An updated list of species and colonies that are part of the project is available in the table below. Detailed progress updates on the completion of resources for each species are available at a VectorBase wiki. The Malaria Research and Reference Reagent Resource Center (MR4) serves as the primary project and community resource for DNA, RNA and live mosquitoes from the colonies it maintains. Based on analysis of the initial assemblies we have produced, we anticipate that Illumina-based assemblies of those species for which colonies are available (as indicated in the table below) will be comparable in quality and completeness to the original Sanger sequencing-based PEST assembly for An. gambiae. Species sequenced from wild-caught material may be expected to exhibit shorter scaffold lengths due to the difficulty of obtaining high molecular weight DNA from such samples.
In addition to the approved goals of high-quality reference genome assemblies of each species and transcriptome sequencing in support of gene annotation, both of which depend upon availability of mosquito colonies, a limited amount of SNP discovery based on wild specimens will augment these genome projects. Illumina-based genome sequencing and assembly, RNAseq and SNP discovery will be managed by the Broad Institute (under the direction of Daniel Neafsey). Genome annotation will be based on contributions by the Broad Institute, VectorBase, and Robert Waterhouse (MIT/Geneva). Overall coordination of the project is handled by Nora Besansky (nbesansk at nd dot edu) and a Coordinating Committee (AGCC). In addition, as the need arises, individuals have agreed to serve as community liaisons between focal groups and the AGCC. Please contact Nora Besansky, other members of the AGCC, or community liaisons with questions, comments or suggestions.
Production sequencing began in spring 2011. Initial genome assemblies for nine species are currently available, with assemblies for the remaining seven species expected by September 2013. A roadmap for completion of genomic resources is available (see roadmap.pdf in Downloads). Once transcriptome and genome assemblies and gene models are 'frozen' and released (available for download from this website and through vectorbase.org), analysis will commence.
This project will culminate with the publication of two flagship papers intended to provide overviews of evolutionary genomic analyses across the more than 90 MY spanned by all 17 available anopheline genomes (including the previously sequenced An. gambiae), and the much more recent diversifications within the An. gambiae sibling species complex (see phylogeny.pdf in Downloads). SNP data have been released for 50 genomes (see anopheles_snp.tar.gz in Downloads – 70+GB file).
Biological themes that will be investigated include:
- Molecular evolution
- Circadian rhythm
- Insecticide resistance/metabolism
- Repetitive elements
- The sialome
- Inversions and chromosomal architecture
- Blood/sugar digestion
- Transcriptional regulation
The AGCC will coordinate a community analysis strategy for any parties interested in contributing to the flagship manuscripts and/or additional focal manuscripts. Please send a brief email to Nora Besansky (nbesansk at nd dot edu) or Daniel Neafsey (neafsey at broadinstitute dot org) describing the analysis topic of interest. Individuals wishing to perform genome-wide analyses on these genomic resources independent of the community publications are advised to abide by the recommendations for data users established by the 2009 Toronto International Data Release Workshop and to contact us so that we can coordinate publications.
To facilitate coordination, transparency, and maximal community engagement, the AGCC will assemble a project wiki by summer 2013 with available information about analysis efforts.
|Species||Subgenus||Informal Category||Reference Assembly
|An. arabiensis||Cellia||Series Pyretophorus||[Isofemale subcolony, 2Rb/b homokaryotype]||Burkina Faso, Cameroon, Kenya|
|An. quadriannulatus||Cellia||(gambiae complex)||SANGWE [Isofemale subcolony, heterokaryotype X+f/f]
|An. merus||Cellia||(gambiae complex)||MAF||Kenya, South Africa|
|An. melas||Cellia||(gambiae complex)||NO COLONY [Sequencing from wild collected from Cameroon]||Bioko, Equatorial Guinea; Ballingho, The Gambia|
|An. christyi||Cellia||||NO COLONY [Sequencing from wild collected from Kenya]
|An. epiroticus||Cellia||(sundaicus complex)||NO COLONY [Sequencing from wild collected from Vietnam]
|An. stephensi||Cellia||Series Neocellia||SDA-500
|An. maculatus (sp. B)||Cellia||(maculatus subgroup)||COLONY NOT AT MR4 [Sequencing from preserved females]
|An. funestus||Cellia||Series Myzomyia||||Burkina Faso (Folonzo & Kiribina)|
|An. minimus s.s. (sp. A)||Cellia||(minimus complex)||MINIMUS1
|An. culicifacies A||Cellia||(culicifacies subgroup)||NO COLONY [Sequencing from wild collected from Iran]||Iran (species A, species D, species A-like)|
|An. farauti 1||Cellia||Series Neomyzomyia||(Papua New Guinea)||PNG, Australia, Solomon Islands|
|An. dirus s.s. (sp. A)||Cellia||(dirus complex)||WRAIR2||Thailand (species A/D)|
|An. sinensis||Anopheles||Hyrcanus Group||SINENSIS
Anopheles Genomes Cluster Coordinating Committee (AGCC)
- George Christophides: g.christophides at imperial dot ac dot uk
- Frank Collins: frank at nd dot edu
- Scott Emrich: semrich at nd dot edu
- Michael Fontaine: Michael.fontaine at nd dot edu
- William Gelbart: gelbart at morgan dot harvard dot edu
- Matthew Hahn: mwh at indiana dot edu
- Paul Howell: bsr7 at cdc dot gov
- Fotis Kafatos: f.kafatos at imperial dot ac dot uk
- Daniel Lawson: Lawson at ebi dot ac dot uk
- Marc Muskavitch: marc.muskavitch at bc dot edu
- Rob Waterhouse: robert.waterhouse at gmail dot com
- Daniel Neafsey: neafsey at broadinstitute dot org
- Nora Besansky: nbesansk at nd dot edu
Community Liaisons
- An. gambiae: Dr. Nora Besansky (nbesansk at nd dot edu)
- An. funestus: Drs. N'Fale Sagnon (n.fale.cnlp at fasonet dot bf) and W. Guelbeogo (guelbeogo.cnrfp at fasonet dot bf)
- An. stephensi: Drs. Igor Sharakhov (igor at vt dot edu) and Jake Tu (jaketu at vt dot edu)
- An. farauti: Dr. Nigel Beebe (n.beebe at uq dot edu dot au)
- An. dirus, An. minimus: Dr. Catherine Walton (Catherine.Walton at manchester dot ac dot uk)
- An. atroparvus: Drs. Maria Sharakhova (msharakh at vt dot edu) and Igor Sharakhov (igor at vt dot edu)
- An. albimanus: Dr. Martinez Barnetche (jmbarnet at correo dot insp dot mx)
Editor note: We're working on integrating SNP data into Olive's data navigation. For the time being, here's an index of the SNP data available in the archive attached to this project.
Project data can also be found at NCBI.
Please cite all data relating to this initiative (including individual genes and genomes) as:
"Anopheles 15 Genomes initiative, Broad Institute (broadinstitute.org)"