A three-gene-based Type 1 diabetes diagnostic signature
Background: Type 1 diabetes is a chronic autoimmune disease characterized by insulin deficiency caused by
pancreatic β-cell loss, which leads to hyperglycaemia.
Objective: There is currently no cure for this disease, and patients require lifelong insulin injections.
Identifying potential diagnostic biomarkers by analysing large-scale data with bioinformatic tools and
machine learning is therefore important for Type 1 diabetes.
Methods: We collected two mRNA expression datasets of Type 1 diabetes peripheral blood samples from GEO, screened
for differentially expressed genes (DEGs) using R software, and conducted GO and KEGG pathway enrichment analysis on the DEGs.
The STRING database and Cytoscape were used to build a protein-protein interaction (PPI) network and predict hub genes.
We constructed a logistic regression model using the hub genes to classify sample type.
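As a rough illustration of the final modelling step, the sketch below fits a logistic regression to three expression features by gradient descent. It is not the authors' pipeline (the abstract reports using R). The expression values are synthetic, the shift between groups is arbitrary, and the helper names `fit_logistic` and `predict_proba` are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability that sample x belongs to the patient class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Synthetic expression values (assumption): 40 controls and 40 patients.
# The +1.0 shift is arbitrary and says nothing about real regulation direction.
rng = random.Random(0)
GENES = ["LTF", "CAMP", "PGLYRP1"]
controls = [[rng.gauss(0.0, 1.0) for _ in GENES] for _ in range(40)]
cases = [[rng.gauss(1.0, 1.0) for _ in GENES] for _ in range(40)]
X = controls + cases
y = [0] * 40 + [1] * 40

w, b = fit_logistic(X, y)
accuracy = sum(
    (predict_proba(w, b, xi) >= 0.5) == bool(yi) for xi, yi in zip(X, y)
) / len(y)
print(dict(zip(GENES, w)), b, accuracy)
```

In practice the model would be fitted on one dataset and evaluated on an independent one, as the study does with GSE50098 and GSE9006. The training accuracy printed here only shows that the fit runs.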
Results: Bioinformatic analysis of the GEO datasets revealed 92 and 75 DEGs in the GSE50098 and GSE9006 datasets, respectively,
with 10 overlapping DEGs. The PPI network of these 10 DEGs yielded 7 hub genes, namely EGR1, LTF, CXCL1, TNFAIP6,
PGLYRP1, CHI3L1 and CAMP. We built a logistic regression model based on these hub genes and optimized it to a
3-gene (LTF, CAMP and PGLYRP1) logistic model. The area under the curve (AUC) values for the training set GSE50098
and the testing set GSE9006 were 0.8452 and 0.8083, respectively, indicating the efficacy of this model.
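An AUC can be read as the probability that a randomly chosen patient sample receives a higher model score than a randomly chosen control sample. The following minimal sketch computes it directly from that definition. The labels and scores are made up for illustration and are not the study's data, and `auc_score` is a hypothetical helper name.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half a correct ranking.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = patient, 0 = control, with hypothetical model scores.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.91, 0.78, 0.62, 0.40, 0.55, 0.35, 0.30, 0.12]
example_auc = auc_score(labels, scores)
print(example_auc)
```

In this toy example 15 of the 16 patient-control pairs are ordered correctly, so the AUC is 0.9375. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.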
Conclusion: The integrated bioinformatic analysis of gene expression in Type 1 diabetes and the logistic regression
model built in our study may provide a promising diagnostic approach for Type 1 diabetes.
Journal Title: Current Pharmaceutical Design