The genome assembly decision support system

Assembling plant genomes is difficult. The genomes may be large, repetitive, and polyploid. These challenges are typically met by combining technologies and custom processing pipelines, with varying results. To date, there is no single best approach to designing plant genome sequencing and assembly projects. Results are sometimes satisfactory and useful, but the process can be time-consuming and error-prone, and it demands a high level of expertise.

Planning a plant genome sequencing and assembly project is challenging because of the great variety of genome compositions, strategies, and tools available. Because only particular cases of the assembly problem have been solved, the best approach at present is to look for similar cases and examine how different strategies have performed on them. However, plant genomes are rarely the focus of assembly tool development, so few test cases are available.

We have created AssemblyKB, a repository of example plant genome assembly projects with a complete description of how each was processed, together with a set of guidelines on common best practices. This is a key milestone on the road to simplifying assembly choices for plant genomes. The repository provides information on how to analyse and assemble the data, and also covers data generation and quality, including whether a given example is applicable to a particular genome.

URL: