PyClone-VI: scalable inference of clonal population structures using whole genome data

BMC Bioinformatics. 2020 Dec 10;21(1):571. doi: 10.1186/s12859-020-03919-2.

Abstract

Background: At diagnosis, tumours are typically composed of a mixture of genomically distinct malignant cell populations. Bulk sequencing of tumour samples coupled with computational deconvolution can be used to identify these populations and study cancer evolution. Existing computational methods for population deconvolution are slow and/or potentially inaccurate when applied to large datasets generated by whole genome sequencing.

Results: We describe PyClone-VI, a computationally efficient Bayesian statistical method for inferring the clonal population structure of cancers. We demonstrate the utility of the method by analyzing data from 1717 patients from the PCAWG study and 100 patients from the TRACERx study.

Conclusions: Our proposed method is 10-100× faster than existing methods while providing results that are as accurate. Software implementing our method is freely available at https://github.com/Roth-Lab/pyclone-vi.

Keywords: Bayesian statistics; Cancer; Cancer evolution; Tumour heterogeneity.

MeSH terms

  • Bayes Theorem
  • Databases, Genetic*
  • Genome, Human*
  • Humans
  • Mutation
  • Neoplasms / genetics
  • Neoplasms / pathology
  • User-Computer Interface*
  • Whole Genome Sequencing