Cancer Genomics & Proteomics
Review Article

Machine Learning Approaches on High Throughput NGS Data to Unveil Mechanisms of Function in Biology and Disease

VASILEIOS C. PEZOULAS, ORSALIA HAZAPIS, NEFELI LAGOPATI, THEMIS P. EXARCHOS, ANDREAS V. GOULES, ATHANASIOS G. TZIOUFAS, DIMITRIOS I. FOTIADIS, IOANNIS G. STRATIS, ATHANASIOS N. YANNACOPOULOS and VASSILIS G. GORGOULIS
Cancer Genomics & Proteomics September 2021, 18 (5) 605-626; DOI: https://doi.org/10.21873/cgp.20284
VASILEIOS C. PEZOULAS
1Unit of Medical Technology and Intelligent Information Systems, University of Ioannina, Ioannina, Greece;
ORSALIA HAZAPIS
2Molecular Carcinogenesis Group, Department of Histology and Embryology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece;
NEFELI LAGOPATI
2Molecular Carcinogenesis Group, Department of Histology and Embryology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece;
3Biomedical Research Foundation of the Academy of Athens, Athens, Greece;
THEMIS P. EXARCHOS
4Unit of Medical Technology and Intelligent Information Systems, Department of Materials Science and Engineering, University of Ioannina, Ioannina, Greece;
5Department of Informatics, Ionian University, Corfu, Greece;
ANDREAS V. GOULES
6Department of Pathophysiology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece;
ATHANASIOS G. TZIOUFAS
6Department of Pathophysiology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece;
DIMITRIOS I. FOTIADIS
1Unit of Medical Technology and Intelligent Information Systems, University of Ioannina, Ioannina, Greece;
IOANNIS G. STRATIS
7Department of Mathematics, National and Kapodistrian University of Athens, Athens, Greece;
ATHANASIOS N. YANNACOPOULOS
8Department of Statistics, and Stochastic Modelling and Applications Laboratory, Athens University of Economics and Business (AUEB), Athens, Greece;
  • For correspondence: ayannaco{at}aueb.gr
VASSILIS G. GORGOULIS
2Molecular Carcinogenesis Group, Department of Histology and Embryology, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece;
3Biomedical Research Foundation of the Academy of Athens, Athens, Greece;
9Division of Cancer Sciences, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, Manchester Cancer Research Centre, NIHR Manchester Biomedical Research Centre, University of Manchester, Manchester, U.K.;
10Center for New Biotechnologies and Precision Medicine, Medical School, National and Kapodistrian University of Athens, Athens, Greece;
11Faculty of Health and Medical Sciences, University of Surrey, Surrey, U.K.
  • For correspondence: vgorg{at}med.uoa.gr

Abstract

In this review, the fundamentals of machine learning (ML) and data mining (DM) are summarized together with the techniques for distilling knowledge from state-of-the-art omics experiments. This includes an introduction to the basic mathematical principles of unsupervised and supervised learning methods, dimensionality reduction techniques and deep neural network architectures, and their applications in bioinformatics. The case studies evaluated mainly involve next generation sequencing (NGS) experiments, such as deciphering gene expression from total and single-cell RNA sequencing (scRNA-seq) analysis; for the latter, recent artificial intelligence (AI) methods for investigating cell sub-types, biomarkers and imputation techniques are described. Other areas in which various ML schemes have been investigated include transcription factor (TF) binding sites, chromatin organization patterns and RNA binding proteins (RBPs); analyses of RNA sequence and structure, as well as ML-based three-dimensional (3D) protein structure prediction, are also described. Furthermore, we summarize recent methods that apply ML in clinical oncology by combining current omics data with pharmacogenomics to determine personalized treatments. With this review we wish to provide the scientific community with a thorough survey of the main novel ML applications that take into consideration the latest achievements in genomics, thus unraveling the fundamental mechanisms of biology towards the understanding and cure of diseases.

Key Words:
  • Machine learning
  • supervised-unsupervised learning
  • NGS
  • gene expression
  • scRNA-seq
  • TFs
  • RBPs
  • RNA structure
  • sequence motifs
  • review
  • Received June 25, 2021.
  • Revision received July 21, 2021.
  • Accepted August 3, 2021.
  • Copyright © 2021 The Author(s). Published by the International Institute of Anticancer Research.