Background: Production of biofuels from lignocellulosic crop biomass offers an alternative means of reducing greenhouse gas emissions. Biofuel production involves collecting biomass and breaking down cell wall components, followed by conversion of the sugars to ethanol. Lignocellulosic biomass comprises 40-50% cellulose, 20-30% hemicellulose, and 10-25% lignin. Sorghum is a widely adapted energy crop for biofuel production. Biomass with low lignin, high cellulose, and high hemicellulose content is exploited to attain maximum biofuel production efficiency. Resistance to lodging, pests, diseases, and abiotic stresses in relation to cell wall components is well documented, and quantitative trait loci have been identified to understand the genetic correlations among these traits. Selection for reduced lignin and increased cellulose content in stover can increase ethanol yield. Genome-wide association studies (GWAS) are a complementary approach for evaluating marker-phenotype associations in large diversity panels, in which single nucleotide polymorphisms (SNPs) are scanned to identify loci associated with the traits of interest. In this study, GWAS was performed on 245 sorghum minicore genotypes to analyze agronomic traits (days to 50% flowering, fresh biomass yield, dry biomass yield) and cell wall components (cellulose, hemicellulose, and lignin). Further, in silico validation of the candidate genes was performed using global gene expression data from large-scale RNA sequencing studies in sorghum available in the NCBI GEO database.
Objective: The objectives of this study were to evaluate native variation in biofuel-related agronomic traits and stalk cell wall components, and to identify significant SNPs or loci related to the cell wall components.
Methods: In this article, an association mapping panel comprising 245 sorghum minicore germplasm accessions was evaluated during the post-rainy seasons of 2013 and 2014, and observations were recorded on a whole-plot basis for days to 50% flowering, fresh biomass yield (t ha⁻¹), and dry biomass yield (t ha⁻¹). The biomass of sun-dried plants from each season was collected separately, chopped, dried, and ground to powder. The cellulose, hemicellulose, and lignin contents were determined in the powdered samples, and the content of each of these three components was expressed as a percentage of dry matter. The data on agronomic traits and composition analysis were subjected to analysis of variance. For the current study, we remapped the raw GBS data to the sorghum reference genome assembly v3.1. A total of 27,589 SNPs were obtained with a minor allele frequency (MAF) >1% and missing data <50%. The GWAS was performed in a single minicore population using FarmCPU in R. The syntenic positions of the identified significant SNPs between sorghum and other model crop species, viz., maize, switchgrass, and Arabidopsis, were represented using CIRCOS software for the traits dry biomass yield, cellulose, hemicellulose, and lignin. The transcriptome dataset from sorghum gene atlas studies of grain, sweet, and bioenergy sorghums, available through NCBI's Gene Expression Omnibus (GEO) under accession number GSE49879, was used to cross-validate the SNPs identified through GWAS for cellulose, hemicellulose, and lignin.
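The SNP quality filter described in the Methods (MAF >1% and missing data <50%) can be illustrated with a minimal sketch. This is not the pipeline used in the study. It assumes a hypothetical genotype matrix with accessions as rows and SNPs as columns, coded as 0/1/2 copies of the alternate allele with NaN for missing calls, and the function name `filter_snps` is introduced here for illustration only.

```python
import numpy as np


def filter_snps(genotypes, min_maf=0.01, max_missing=0.50):
    """Return a boolean mask of SNPs (columns) that pass both filters.

    genotypes: 2-D array, rows = accessions, columns = SNPs, coded as
    0/1/2 copies of the alternate allele, with np.nan for a missing call
    (an assumed coding, not taken from the study).
    """
    genotypes = np.asarray(genotypes, dtype=float)
    n_accessions = genotypes.shape[0]

    # Number of non-missing calls and the missing-data rate per SNP.
    n_called = (~np.isnan(genotypes)).sum(axis=0)
    missing_rate = 1.0 - n_called / n_accessions

    # Alternate allele frequency among called diploid genotypes.
    alt_count = np.nansum(genotypes, axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        alt_freq = alt_count / (2.0 * n_called)
        # Minor allele frequency is the smaller of the two allele frequencies.
        maf = np.minimum(alt_freq, 1.0 - alt_freq)
        # SNPs with no calls have NaN MAF, compare as False, and are dropped.
        keep = (maf > min_maf) & (missing_rate < max_missing)
    return keep


# Toy matrix: 4 accessions x 4 SNPs.
toy = np.array([
    [0, 0, np.nan, 1],
    [1, 0, np.nan, 1],
    [2, 0, np.nan, np.nan],
    [1, 0, 2, 0],
])
mask = filter_snps(toy)
print(mask.tolist())  # [True, False, False, True]
```

In the toy matrix, the second SNP is monomorphic (MAF of 0) and the third has 75% missing calls, so only the first and fourth SNPs are retained.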
Results: High broad-sense heritability was observed for all traits in individual seasons, along with significant genotype × environment interaction across seasons for all traits except lignin. Association mapping at P < 1 × 10⁻⁴ revealed genomic regions associated with (i) agronomic traits (days to 50% flowering, fresh biomass yield, and dry biomass yield) and (ii) biochemical traits (cellulose, hemicellulose, and lignin) related to biofuel production, in individual seasons. Twelve significant SNPs were identified for flowering time, 30 for fresh biomass yield, 24 for dry biomass yield, 25 for cellulose, 7 for hemicellulose, and 21 for lignin. A CIRCOS plot was constructed to identify and analyze similarities and differences between the sorghum genome and those of other crops. For cellulose, high similarity (>80%) was observed between all sorghum gene sequences and their maize homologs. The overall similarity of the sorghum homologs was >65% with foxtail millet, 30.6% to 48.6% with Arabidopsis, and 28.2% to 92.8% with rice. SNPs for hemicellulose displayed maximum similarity to foxtail millet, followed by maize. The sequence similarity of lignin SNPs in sorghum was highest with the maize genome, followed by Arabidopsis. Both rice and foxtail millet showed >55% similarity to the sorghum genome.
Conclusion: This study reports large variability and high heritability for agronomic and biofuel traits in the sorghum minicore collection. The genetic architecture of cell wall components was studied using the GWAS approach, and candidate genes for each component were annotated. These results give a better understanding of the genetic basis of sorghum cell wall composition. The association analysis identified genomic regions that could be targeted to improve biomass quality and yield together with the desired composition, thereby promoting breeding efficiency for enhanced biofuel yield.