Dairy Safety Prediction Based on Machine Learning Combined with Chemicals

Author(s): Jiahui Chen, Guangya Zhou, Jiayang Xie, Minjia Wang, Yanting Ding, Shuxian Chen, Sijing Xia, Xiaojun Deng*, Qin Chen*, Bing Niu*

Journal Name: Medicinal Chemistry

Volume 16, Issue 5, 2020

Background: Dairy safety is a matter of widespread public concern, because unsafe dairy products threaten people's health and lives. To improve the safety of dairy products and prevent dairy safety incidents, countries have established various prevention and control measures and safety warnings.

Objective: This study aims to establish a machine learning-based dairy safety prediction model that determines whether dairy products are qualified.

Methods: The 34 items commonly tested in dairy sampling inspections were used as features in this study. Feature selection was performed on the data to obtain a better feature subset, and different algorithms were applied to construct classification models.
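The abstract does not name the feature selection method, so the following is only a minimal sketch of one simple filter-style approach: rank each feature by the absolute Pearson correlation between its values and the class label, then keep the top k. The function names, the toy values, the "protein" feature and the label encoding (1 = unqualified) are all illustrative assumptions, not the authors' data or procedure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    # A constant column carries no information; score it as 0.
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def select_top_k(columns, labels, k):
    """Rank features by |correlation with the label| and return the top k names."""
    scores = {name: abs(pearson(values, labels)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Invented toy data: 6 samples, label 1 = unqualified, 0 = qualified (assumption).
columns = {
    "total plate": [1.0, 1.2, 0.9, 5.0, 5.5, 6.1],
    "water": [3.0, 3.1, 2.9, 3.0, 4.0, 4.2],
    "protein": [3.2, 3.0, 3.1, 3.2, 3.0, 3.1],  # hypothetical uninformative feature
}
labels = [0, 0, 0, 1, 1, 1]

selected = select_top_k(columns, labels, k=2)
print(selected)
```

In a real study the selected subset would then be passed to each candidate classifier and compared on held-out data; a univariate filter like this one ignores redundancy between features, which is one reason more elaborate selection criteria exist.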

Results: The prediction model constructed from a feature subset including “total plate”, “water” and “nitrate” was superior. The sensitivity (SN), specificity (SP) and accuracy (ACC) of the model were 62.50%, 91.67% and 72.22%, respectively. Models established with the integrated algorithm were more accurate than those established with the non-integrated algorithm.
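For readers unfamiliar with the three metrics, the sketch below computes them from the standard confusion-matrix definitions. The abstract does not report the confusion matrix or state which class counts as positive, so the counts used here (TP = 15, FN = 9, TN = 11, FP = 1) are a hypothetical set chosen only because they reproduce the reported percentages.

```python
def sn_sp_acc(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sn = tp / (tp + fn)                    # share of positives correctly identified
    sp = tn / (tn + fp)                    # share of negatives correctly identified
    acc = (tp + tn) / (tp + fn + tn + fp)  # share of all samples correctly classified
    return sn, sp, acc


# Hypothetical counts (not from the paper) consistent with the reported values.
sn, sp, acc = sn_sp_acc(tp=15, fn=9, tn=11, fp=1)
print(f"SN={sn:.2%} SP={sp:.2%} ACC={acc:.2%}")
# prints: SN=62.50% SP=91.67% ACC=72.22%
```

Because accuracy is a weighted average of sensitivity and specificity, an accuracy of 72.22% sitting closer to the sensitivity than to the specificity implies that, under these definitions, the positive class was the larger one in the evaluation set.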

Conclusion: This study provides a new method for assessing dairy safety. It helps to improve the quality of dairy products, ensure their safety, and reduce dairy safety risks.

Keywords: Dairy safety, machine learning, prediction, inspection, algorithm, chemicals.

Article Details

Year: 2020
Published on: 07 August, 2020
Page: [664 - 676]
Pages: 13
DOI: 10.2174/1573406415666191004142810