Recent Advances on Prediction of Human Papillomaviruses Risk Types

Author(s): Yuhua Yao*, Huimin Xu, Manzhi Li, Zhaohui Qi, Bo Liao.

Journal Name: Current Drug Metabolism

Volume 20, Issue 3, 2019

Abstract:

Background: Studies have shown that Human Papillomavirus (HPV) is strongly associated with cervical cancer, which remains the fourth most common cancer affecting women worldwide. Detecting the risk types of human papillomaviruses is therefore both challenging and essential.

Methods: Many epidemiological and experimental methods have recently been proposed to determine whether an HPV type is high-risk. A few computational studies have also addressed HPV risk type prediction; all of them are based on Machine Learning (ML) techniques but adopt different feature extraction methods. We therefore summarize and discuss several classical approaches that have achieved good results in HPV risk type prediction.

Results: This review summarizes the common methods for detecting HPV risk types. The main methods are sequence-derived features, text-based classification, the gap-kernel method, ensemble SVM, the word statistical model, the position-specific statistical model and the mismatch kernel method (SVM). Among these, the position-specific statistical model achieves a relatively high accuracy (accuracy=97.18%). The word statistical model is also a novel approach: it extracts information about HPV from the protein “sequence space” to predict high-risk types of HPVs (accuracy=95.59%). These methods could potentially be used to improve the prediction of high-risk types of HPVs.
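
To give a rough idea of what a word-based sequence feature can look like, the following is a minimal sketch that turns a protein sequence into relative frequencies of overlapping k-letter words. It is an illustration under stated assumptions, not the word statistical model or any other method evaluated in the reviewed studies: the function name `word_frequencies`, the word length `k=2` and the toy input string are all hypothetical.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acid one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def word_frequencies(sequence, k=2):
    """Return the relative frequency of every overlapping k-letter word.

    Hypothetical helper for illustration only; it is not taken from
    the reviewed studies.
    """
    sequence = sequence.upper()
    words = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    # Skip words that contain non-standard residue symbols (e.g. X).
    words = [w for w in words if all(c in AMINO_ACIDS for c in w)]
    counts = Counter(words)
    total = sum(counts.values())
    vocabulary = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    if total == 0:
        return {w: 0.0 for w in vocabulary}
    return {w: counts[w] / total for w in vocabulary}


# Toy fragment chosen for the example; not a real HPV protein sequence.
features = word_frequencies("MHQKRTAMFQ", k=2)
```

A fixed-length vector of this kind could in principle be passed to a classifier such as an SVM; which features and classifier each reviewed method actually uses is described in the cited works.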

Conclusion: The reported prediction accuracies indicate that establishing mathematical models yields more accurate classification results. Adopting mathematical methods to predict the risk type of HPV will therefore be the main goal of future research.

Keywords: Human Papillomavirus (HPV), computational methods, classification of risk types, machine learning algorithms, Position-Specific Statistical Model, Statistical Model of Protein “Sequence Space”.





Article Details

VOLUME: 20
ISSUE: 3
Year: 2019
Page: [236 - 243]
Pages: 8
DOI: 10.2174/1389200220666190118110012