VirDB: Crowdsourced Database for Evaluation of Dynamical Viral Infection Models

Author(s): Szymon Wasik*, Marcin Jaroszewski, Mateusz Nowaczyk, Natalia Szostak, Tomasz Prejzendanc, Jacek Blazewicz.

Journal Name: Current Bioinformatics

Volume 14, Issue 8, 2019

Background: Open science is an emerging movement that emphasizes transparent, high-quality research whose results can be verified and reused by others. However, one of the biggest obstacles to replicating experiments is the lack of access to the data used by the authors. This problem also arises in the mathematical modeling of viral infections, a process that, when conducted correctly, can provide valuable insights into viral activity or into a drug’s mechanism of action.

Objective: We present the VirDB database, which has two primary objectives. First, it is a tool for collecting data on viral infections, shared according to the FAIR data sharing principles, that can be used to develop new dynamic models of infections. Second, it allows storing references to descriptions of viral infection models, together with their evaluation results.

Methods: To facilitate rapid population of the database and easy exchange of scientific data, we decided to use crowdsourcing for collecting data. Such an approach has already proved very successful in projects such as Wikipedia.

Conclusion: VirDB builds on the concepts and recommendations of Open Science and shares data using the FAIR principles. As a result, the data required for designing and evaluating models of viral infections can be stored and made freely available on the Internet.

Keywords: Viral dynamics, database, open science, evaluation, viral infections modelling, bioinformatics.

Article Details

Year: 2019
Page: [740 - 748]
Pages: 9
DOI: 10.2174/1574893614666190308155904
