Genome-scale resequencing is now widely used in plants to identify variation in natural populations and crop stocks. This not only provides insight into evolutionary mechanisms and functional processes, but also allows the rapid statistical correlation of variant loci with quantitative traits and the identification of probable causative genes.
The transPLANT variation archive has been developed to provide a system for the persistent storage and analysis of variation data from plant species. We accept submissions of variant data in VCF format located on known reference sequences, i.e. sequences present in one of the INSDC databases: ENA, GenBank, or DDBJ.
Submitted data will be:
- Persistently archived via the ENA.
- Uniquely accessioned per submission and at each variant locus.
Data from different submissions referencing the same locus will be assigned a common, non-redundant identifier in the context of the given reference sequence.
- Propagated to new versions of the reference sequence.
When a new version of a reference sequence is submitted to the INSDC, all accessioned variants will be mapped onto the new reference sequence (where possible).
- Made available for download.
Both the original (and accessioned) submission and any subsequent updates will be made available, e.g. here.
- Made available in the Ensembl Plants interface.
Where the submitted data is located on a reference sequence that is used in the Ensembl Plants database, it will be visible there.
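The accessioning rule described above (one common identifier per locus, shared across submissions, in the context of a given reference sequence) can be illustrated with a minimal sketch. This is not the archive's actual implementation. It assumes that a locus is defined by the reference sequence name plus the 1-based position taken from the first two VCF columns. The `LocusRegistry` class, the `LOC` identifier prefix, and the reference name `EXAMPLE_REF.1` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Assumption: a locus is (reference sequence name, 1-based position).
Locus = Tuple[str, int]


class LocusRegistry:
    """Hands out one identifier per locus, reusing it across submissions."""

    def __init__(self, prefix: str = "LOC") -> None:
        self.prefix = prefix  # hypothetical identifier prefix
        self._ids: Dict[Locus, str] = {}

    def accession(self, reference: str, position: int) -> str:
        locus = (reference, position)
        if locus not in self._ids:
            # First time this locus is seen: mint the next identifier.
            self._ids[locus] = f"{self.prefix}{len(self._ids) + 1:06d}"
        return self._ids[locus]


def vcf_loci(lines: Iterable[str]) -> Iterator[Locus]:
    """Yield (CHROM, POS) from VCF data lines, skipping headers and blanks."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos = line.split("\t")[:2]
        yield chrom, int(pos)


def accession_submission(registry: LocusRegistry, lines: Iterable[str]) -> List[str]:
    """Return the locus identifier for every record in one submission."""
    return [registry.accession(ref, pos) for ref, pos in vcf_loci(lines)]


# Two made-up submissions on the same hypothetical reference sequence.
submission_a = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "EXAMPLE_REF.1\t100\t.\tA\tG\t.\t.\t.",
    "EXAMPLE_REF.1\t250\t.\tC\tT\t.\t.\t.",
]
submission_b = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "EXAMPLE_REF.1\t250\t.\tC\tA\t.\t.\t.",
    "EXAMPLE_REF.1\t300\t.\tG\tC\t.\t.\t.",
]

registry = LocusRegistry()
ids_a = accession_submission(registry, submission_a)
ids_b = accession_submission(registry, submission_b)
```

Here position 250 appears in both submissions, so both receive the same identifier, while positions 100 and 300 each get their own. Keying on the reference sequence name means that identical positions on different reference sequences would not be merged, which matches the "in the context of the given reference sequence" wording above.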
This work is being carried out in the context of the development of the European Variation Archive, a new resource to organize and process genomic variation data for all species.
The archive is now open for submission! Submit a VCF...