Scientists have sequenced the massive and complex genome of sugarcane, which may lead to the development of hardier and more productive crops.
For centuries, sugarcane has supplied humans with alcohol, biofuel, building and weaving materials, and the world's most relied-upon source of sugar, said researchers at the University of Illinois in the US.
Producing the comprehensive sequence required a concerted effort by more than 100 scientists from 16 institutions, who published their findings in the journal Nature Genetics.
The sugarcane grown by most farmers is a hybrid of two species: Saccharum officinarum, which grows large plants with high sugar content, and Saccharum spontaneum, whose lesser size and sweetness are offset by increased disease resistance and tolerance of environmental stress.
Lacking a complete genome sequence, plant breeders have made high-yielding, robust strains through generations of crossing and selection, but this is an arduous process relying on time and luck, researchers said.
"Sugarcane is the fifth most valuable crop, and the lack of a reference genome hindered genomic research and molecular breeding for sugarcane improvement," said Ray Ming, a professor at the University of Illinois.
At some point in the evolutionary history of sugarcane, its genome was duplicated twice, resulting in four slightly different versions of each pair of chromosomes crammed into the same nucleus, researchers said.
These events not only quadrupled the size of the genome, and therefore the sheer volume of DNA to be sequenced, but also made the highly similar sequences left by the genome-wide duplications much more difficult to assemble into distinct chromosomes, they said.
Genomic DNA is typically sequenced in small, overlapping fragments, and the sequence data from those fragments become overlapping pieces of an enormous linear puzzle.
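The puzzle analogy can be made concrete with a toy example. The sketch below is a minimal greedy overlap merge written for illustration only; it is not the assembly software the team used, and the function names and the short made-up fragments are assumptions.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments that share the longest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return frags  # nothing left that overlaps
        merged = frags[i] + frags[j][n:]
        frags = [f for k, f in enumerate(frags) if k not in (i, j)] + [merged]


# Made-up fragments, purely illustrative.
fragments = ["ATGGCGT", "GCGTACA", "ACAGGTT"]
contigs = greedy_assemble(fragments)
print(contigs)
```

When several near-identical copies of a region exist, as in a duplicated genome, a fragment can overlap equally well with pieces from more than one copy, which is why overlap information alone struggles to keep the copies apart.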
To sort those pieces, the team used a technique called high-throughput chromatin conformation capture, or Hi-C.
This method allows researchers to discover what parts of the long, tangled strands of chromosomal DNA lie in contact with one another inside the cell.
When analysed using a customised algorithm called ALLHIC, the resulting data provided a rough map of which sections of sequence most likely belonged to which chromosome.
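The idea of using contact data to sort sequence into chromosomes can be sketched in a few lines. The example below is a simplified stand-in, not the ALLHIC algorithm itself: it assumes that pieces of sequence from the same chromosome share more contacts, and it merges the most strongly linked groups until a chosen number of groups remains. The contig names, contact counts and function name are all hypothetical.

```python
def group_contigs(contacts: dict[tuple[str, str], int], k: int) -> list[set[str]]:
    """Merge the two groups with the highest average contact count until k remain."""
    contigs = sorted({c for pair in contacts for c in pair})
    groups = [{c} for c in contigs]

    def link(g1: set[str], g2: set[str]) -> float:
        # Average number of contacts per contig pair between two groups.
        total = sum(
            contacts.get((a, b), 0) + contacts.get((b, a), 0)
            for a in g1
            for b in g2
        )
        return total / (len(g1) * len(g2))

    while len(groups) > k:
        i, j = max(
            ((i, j) for i in range(len(groups)) for j in range(i + 1, len(groups))),
            key=lambda ij: link(groups[ij[0]], groups[ij[1]]),
        )
        groups[i] |= groups[j]
        del groups[j]
    return groups


# Hypothetical contact counts between five contigs.
contacts = {
    ("c1", "c2"): 90, ("c2", "c3"): 80, ("c1", "c3"): 70,
    ("c4", "c5"): 85, ("c1", "c4"): 5, ("c3", "c5"): 4,
}
groups = group_contigs(contacts, k=2)
print(sorted(sorted(g) for g in groups))
```

Averaging the contacts over group size keeps one large group from absorbing everything simply because it has more members.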
"The biggest surprise was that by combining long sequence reads and the Hi-C physical map, we assembled an autotetraploid (quadrupled) genome into 32 chromosomes and realised our goal of allele-specific annotation among homologous chromosomes," Ming said.
The researchers now knew which gene sequences belonged to each of the four variations on the original, pre-duplication genome, a much higher level of detail than they expected to attain.
With this information, the researchers could form better hypotheses about the mysteries of the sugarcane genome's evolutionary history.
Through comparison with the genomes of related species, researchers knew that at some point the number of unique chromosomes had dropped from 10 to eight.
To the team's surprise, the new sequence data revealed that two different chromosomes had split apart, and all four halves had then fused to different existing chromosomes, a more complex set of events than the one they had hypothesised.
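The bookkeeping behind that drop from 10 to eight can be checked with a small model. In the sketch below, the chromosome labels and the choice of which chromosomes split and which receive the halves are arbitrary placeholders, since the article does not name them; only the counts (10 ancestral chromosomes, two splits, four fusions, four copies) come from the report.

```python
def reduce_chromosomes(
    ancestral: list[str], split: list[str], targets: list[str]
) -> list[str]:
    """Split each chromosome in `split` in two and fuse every half onto its own target."""
    halves = [f"{c}{half}" for c in split for half in ("a", "b")]
    if len(halves) != len(targets) or len(set(targets)) != len(targets):
        raise ValueError("each half needs its own distinct target chromosome")
    if set(split) & set(targets):
        raise ValueError("a split chromosome cannot also be a fusion target")
    fused = dict(zip(targets, halves))
    result = []
    for c in ancestral:
        if c in split:
            continue  # no longer exists as an independent chromosome
        result.append(f"{c}+{fused[c]}" if c in fused else c)
    return result


# Placeholder labels A1..A10; the real chromosomes involved are not given here.
ancestral = [f"A{i}" for i in range(1, 11)]
modern = reduce_chromosomes(
    ancestral, split=["A9", "A10"], targets=["A1", "A2", "A3", "A4"]
)
copies = 4  # two rounds of genome duplication
print(len(modern), len(modern) * copies)
```

Two chromosomes disappear as independent units while no new ones are created, which leaves eight; four copies of each give the 32 chromosomes in the assembly.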
Along with these large physical rearrangements within the genome come changes to the genes in the affected regions.
Ming and his colleagues found that the large chunks of chromosome that had moved to new locations contained many more genes that help plants resist disease than were found elsewhere in the genome.
"It resolved a mystery why S. spontaneum is such a superior source of disease resistance and stress tolerance genes," Ming said.
"This discovery will accelerate mining effective alleles of disease resistance genes that have incorporated into elite modern sugarcane hybrid cultivars, and subsequently the implement of molecular breeding (of sugarcane)," he said.
(With inputs from agencies.)