Acta Scientific Pharmaceutical Sciences (ASPS) (ISSN: 2581-5423)

Research Article Volume 4 Issue 1

Simulation Comparison of Statistical Approaches and Procedures in Building SNP based Prediction Models for Drug Response

Wencan Zhang1*, Pingye Zhang2, Feng Gao3, Yonghong Zhu4 and Ray Liu5

1Takeda Develop Center, One Takeda PKWY, Deerfield, USA
2Merck, Lincoln Avenue, Rahway NJ, USA
3Biogen, Cambridge, USA
4Shanghai Henlius Biotech Inc, Shanghai, China
5Takeda, Cambridge, USA

*Corresponding Author: Wencan Zhang, Takeda Develop Center, One Takeda PKWY, Deerfield, USA.

Received: December 13, 2019; Published: December 23, 2019



  Lack of replication of findings and missing heritability are two of the major challenges in pharmacogenetics (PGx) studies that aim to develop predictive models for common disease prognosis and drug response. Recent innovations in statistical procedures and methodologies may help us understand and meet these challenges. We used simulation-based approaches to compare the predictive accuracy of different prediction algorithms. In the first simulation study, we compared four 1-step models and one 2-step model built with five different approaches: Elastic Net (EN), Genome-wide Association Study (GWAS)+EN, Principal Component Regression (PCR), Random Forest (RF) and Support Vector Machine (SVM). The results showed that EN had the smallest test mean squared error (MSE) and the highest sensitivity and causal %. In the second simulation study, we compared three 2-step approaches: GWAS+EN, GWAS+RF and GWAS+SVM. GWAS+RF had the smallest test MSE and the best accuracy in picking up the seeded causal SNP variants. In the third simulation study, we compared two cross-validation procedures: GWAS+EN vs. modified learn-and-confirm cross-validation GWAS+EN (Modified CV GWAS+EN). The results showed that the latter approach had better prediction accuracy, at the expense of substantially greater computational resources.

Keywords: Heritability; Sensitivity; SNPs
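
The 2-step idea summarized in the abstract (a GWAS-style screen followed by a penalized regression on the retained SNPs) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the sample size, the number of SNPs, the allele frequency, the effect size, the use of marginal correlation as the screening statistic, the penalty settings `lam` and `alpha`, and the definition of sensitivity as the fraction of seeded causal SNPs retained with a nonzero coefficient. The elastic net is fitted with a plain coordinate-descent loop written only with the Python standard library.

```python
import random

def simulate(n, p, causal, effect, rng):
    """Hypothetical data: genotypes coded 0/1/2 (allele freq 0.3), additive effects."""
    X = [[sum(rng.random() < 0.3 for _ in range(2)) for _ in range(p)]
         for _ in range(n)]
    y = [effect * sum(row[j] for j in causal) + rng.gauss(0, 1) for row in X]
    return X, y

def marginal_screen(X, y, k):
    """Step 1 (GWAS-like): keep the k SNPs with largest |correlation| with y."""
    n = len(y)
    ybar = sum(y) / n
    syy = sum((b - ybar) ** 2 for b in y)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        xbar = sum(col) / n
        sxy = sum((a - xbar) * (b - ybar) for a, b in zip(col, y))
        sxx = sum((a - xbar) ** 2 for a in col)
        scores.append(abs(sxy) / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0)
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]

def soft(z, t):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    return z - t if z > t else z + t if z < -t else 0.0

def elastic_net(X, y, lam=0.05, alpha=0.5, sweeps=100):
    """Step 2: coordinate descent for
    (1/2n)||y - b0 - Xb||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*|b|_2^2)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    ybar = sum(y) / n
    cols = [[row[j] - means[j] for row in X] for j in range(p)]  # centered columns
    resid = [v - ybar for v in y]                                # residual at beta = 0
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            sq = sum(c * c for c in cols[j]) / n
            if sq == 0:
                continue  # monomorphic SNP carries no information
            rho = sum(c * r for c, r in zip(cols[j], resid)) / n + sq * beta[j]
            new = soft(rho, lam * alpha) / (sq + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, cols[j])]
                beta[j] = new
    intercept = ybar - sum(b * m for b, m in zip(beta, means))
    return intercept, beta

rng = random.Random(0)
causal = [0, 1, 2, 3, 4]                       # seeded causal SNP indices (assumed)
X, y = simulate(300, 200, causal, 0.8, rng)
X_tr, y_tr, X_te, y_te = X[:200], y[:200], X[200:], y[200:]

selected = marginal_screen(X_tr, y_tr, k=20)   # screen on training data only
b0, beta = elastic_net([[row[j] for j in selected] for row in X_tr], y_tr)

pred = [b0 + sum(b * row[j] for b, j in zip(beta, selected)) for row in X_te]
test_mse = sum((a - b) ** 2 for a, b in zip(pred, y_te)) / len(y_te)
kept = [j for j, b in zip(selected, beta) if b != 0.0]
sensitivity = len(set(kept) & set(causal)) / len(causal)
print(f"test MSE = {test_mse:.3f}, sensitivity = {sensitivity:.2f}")
```

Note that the screen is run on the training split only; screening on the full data before splitting would leak information into the test error, which is the selection-bias concern that motivates refitting the selection step inside each cross-validation fold.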





Citation: Wencan Zhang., et al. “Simulation Comparison of Statistical Approaches and Procedures in Building SNP based Prediction Models for Drug Response”. Acta Scientific Pharmaceutical Sciences 4.1 (2020): 38-43.
