Prediction and Identification of Krüppel-Like Transcription Factors by Machine Learning Method

Author(s): Zhijun Liao, Xinrui Wang, Xingyong Chen, Quan Zou*

Journal Name: Combinatorial Chemistry & High Throughput Screening
Accelerated Technologies for Biotechnology, Bioassays, Medicinal Chemistry and Natural Products Research

Volume 20, Issue 7, 2017


Aim and Objective: The Krüppel-like factors (KLFs) are a family of zinc finger (ZF) motif-containing transcription factors with 18 members in the human genome; among them, KLF18 was predicted by bioinformatics. KLFs have various physiological functions and are involved in a number of cancers and other diseases. Here we perform a binary-class classification of KLFs and non-KLFs using machine learning methods.

Material and Method: The protein sequences of KLFs and non-KLFs were retrieved from UniProt and randomly separated into a training dataset (containing positive and negative sequences) and a test dataset (containing only negative sequences). After extracting 188-dimensional (188D) feature vectors, we carried out classification with four classifiers (GBDT, libSVM, RF, and k-NN). For the human KLFs, we further examined the evolutionary relationships and motif distribution, and finally analyzed the conserved amino acid residues of the three zinc fingers.
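
As a rough illustration of the evaluation setup described above, the following sketch runs 10-fold cross-validation with a small k-NN classifier in plain Python. It is not the authors' pipeline: the 4-dimensional toy vectors stand in for the 188D features, and the function names (`knn_predict`, `cross_validate`), the choice k=3, and the synthetic data are all assumptions made for this example.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training vectors."""
    # Squared Euclidean distance is sufficient for ranking neighbours.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cross_validate(X, y, folds=10, k=3, seed=0):
    """Return accuracy pooled over `folds` random, near-equal partitions."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test_idx = idx[f::folds]  # every folds-th shuffled index
        held_out = set(test_idx)
        train_X = [X[i] for i in idx if i not in held_out]
        train_y = [y[i] for i in idx if i not in held_out]
        correct += sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx
        )
    return correct / len(X)


# Hypothetical toy data: label 1 plays the role of "KLF", label 0 "non-KLF".
# Real inputs would be 188D feature vectors computed from protein sequences.
rng = random.Random(1)
positives = [[1.0 + rng.uniform(-0.2, 0.2) for _ in range(4)] for _ in range(20)]
negatives = [[0.0 + rng.uniform(-0.2, 0.2) for _ in range(4)] for _ in range(20)]
X = positives + negatives
y = [1] * 20 + [0] * 20

acc = cross_validate(X, y, folds=10, k=3)
print(f"10-fold cross-validated accuracy: {acc:.2%}")
```

The toy clusters are deliberately well separated, so the score here says nothing about how hard the real KLF task is; the point is only the fold structure, in which each sample is held out exactly once.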

Results: The classifier models built from the training dataset were well constructed. The highest specificity (Sp), 99.83%, was obtained with a library for support vector machines (libSVM), and all correctly classified rates were over 70% for 10-fold cross-validation on the test dataset. The 18 human KLFs can be further divided into 7 groups, and the zinc finger domains are located at the carboxyl terminus. Many amino acid residues are conserved, including cysteine and histidine, and the span and interval between them are consistent across the three ZF domains.
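
For readers unfamiliar with the metric, specificity is conventionally defined as Sp = TN / (TN + FP), the fraction of true negatives that are correctly rejected. The sketch below computes it from predicted and true labels; the helper names and the example labels are hypothetical and are not taken from the paper. Note that on a dataset containing only negative sequences, Sp and the correctly classified rate coincide, since every sample is a negative.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FN, TN, FP) for a binary labelling."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1  # positive sample predicted as negative
        elif p == positive:
            fp += 1  # negative sample predicted as positive
        else:
            tn += 1
    return tp, fn, tn, fp


def specificity(y_true, y_pred, positive=1):
    """Sp = TN / (TN + FP); NaN when there are no negative samples."""
    _, _, tn, fp = confusion_counts(y_true, y_pred, positive)
    return tn / (tn + fp) if (tn + fp) else float("nan")


# Made-up labels for illustration: 1 = KLF, 0 = non-KLF.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
sp = specificity(y_true, y_pred)
print(counts, sp)
```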

Conclusion: Two classification models for KLF prediction have been built using novel machine learning methods.

Keywords: Krüppel-like factor, binary-class classification, phylogenetic analysis, motif, a library for support vector machine, machine learning method.


Article Details

Year: 2017
Page: [594 - 602]
Pages: 9
DOI: 10.2174/1386207320666170314094951
