Identification of the Diagnostic Signature of Sepsis Based on Bioinformatic Analysis of Gene Expression and Machine Learning

(E-pub Ahead of Print)

Author(s): Qian Zhao, Ning Xu, Hui Guo, Jianguo Li*

Journal Name: Combinatorial Chemistry & High Throughput Screening
Accelerated Technologies for Biotechnology, Bioassays, Medicinal Chemistry and Natural Products Research


Background: Sepsis is a life-threatening disease caused by a dysregulated host response to infection, and it is the major cause of death among patients in the intensive care unit (ICU).

Objective: Early diagnosis of sepsis could significantly reduce in-hospital mortality. Although it arises from infection, sepsis develops according to its own pathophysiological processes and patterns, which vary with gender, health status and other factors. Hence, analyzing large-scale data with bioinformatic tools and machine learning is a promising approach to exploring methods for early diagnosis.

Methods: We collected miRNA and mRNA expression data for sepsis blood samples from the Gene Expression Omnibus (GEO) and ArrayExpress databases, screened for differentially expressed genes (DEGs) using R, predicted miRNA targets with TargetScanHuman and miRTarBase, and conducted Gene Ontology (GO) term and KEGG pathway enrichment analyses on the overlapping DEGs. The STRING database and Cytoscape were used to build a protein-protein interaction (PPI) network and predict hub genes. We then constructed a Random Forest model using the hub genes to predict sample type.

Results: Bioinformatic analysis of the GEO data revealed 46 overlapping DEGs in sepsis. PPI network analysis identified five hub genes: SOCS3, KBTBD6, FBXL5, FEM1C and WSB1. A Random Forest model based on these five hub genes was used to assess the GSE95233 and GSE95233 datasets; the areas under the ROC curve (AUC) were 0.900 and 0.7988, respectively, confirming the efficacy of the model.
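
As a reading aid for the AUC values reported above, the following is a minimal sketch of how an area under the ROC curve can be computed from predicted scores. It is not the authors' code: the function name roc_auc, the labels and the scores are all hypothetical, and the sketch assumes that label 1 denotes a sepsis sample and that a higher score means the classifier considers sepsis more likely. It uses the standard identity that the AUC equals the probability that a randomly chosen positive sample is scored above a randomly chosen negative one.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = sepsis, 0 = control (illustrative values only,
# not taken from the study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 would mean every sepsis sample is ranked above every control, and 0.5 corresponds to chance-level ranking.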

Conclusion: The integrated analysis of gene expression in sepsis and the effective Random Forest model built in this study may provide promising diagnostic methods for sepsis.

Keywords: Sepsis, Bioinformatic analysis, Random Forest, Diagnosis, GO, KEGG

Article Details

DOI: 10.2174/1386207323666201204130031