ScanMap: Supervised Confounding Aware Non-negative Matrix Factorization for Polygenic Risk Modeling

Abstract: Molecular mechanisms, which are often encoded in gene sets or pathways, are important for informing targeted interventions. Existing machine learning approaches often struggle to reduce high dimensionality while also learning features that discriminate between disease types, particularly since confounding variables are usually present. We aim to improve the accuracy and interpretability of prediction models for genetic studies by introducing Supervised Confounding Aware Non-negative Matrix Factorization for Polygenic Risk Modeling (ScanMap). ScanMap selects informative groups of genes that embody multiple interacting molecular functions by using a supervised model that integrates both the gene groups and the confounding variables in predicting disease type and status. The learned gene groups reflect interacting molecular mechanisms, which makes them suitable features for polygenic risk modeling. These learned features are then used to train a softmax classifier for disease type and status prediction. We evaluated ScanMap against multiple state-of-the-art unsupervised and supervised matrix factorization models using large-scale NGS datasets. ScanMap significantly outperformed all comparison models (p < 0.05). Feature analysis was performed to illustrate the insights and benefits that the gene groups learned by ScanMap provide in disease risk prediction.
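The general idea of feeding factorized gene groups and confounders into a softmax classifier can be sketched as follows. This is a simplified two-stage illustration, not the ScanMap method itself: the abstract describes a supervised model that integrates gene groups and confounders during factorization, whereas this sketch runs a plain unsupervised NMF (multiplicative updates) first and fits the classifier afterwards. All names, matrix sizes, hyperparameters, and the random data are assumptions made for illustration only.

```python
import numpy as np

def nmf(X, k, iters=200, seed=0, eps=1e-9):
    """Unsupervised NMF, X ~ W @ H, via multiplicative updates (illustrative)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps  # samples x gene groups
    H = rng.random((k, m)) + eps  # gene groups x genes
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_softmax(F, y, n_classes, lr=0.1, iters=500):
    """Softmax regression fitted by plain gradient descent on cross-entropy."""
    n, d = F.shape
    Y = np.eye(n_classes)[y]      # one-hot labels
    B = np.zeros((d, n_classes))
    for _ in range(iters):
        P = softmax(F @ B)
        B -= lr * F.T @ (P - Y) / n
    return B

# Hypothetical toy data: 60 samples, 30 genes, 2 confounders, 3 disease classes.
rng = np.random.default_rng(0)
X = rng.random((60, 30))          # non-negative gene-level matrix (assumed)
C = rng.random((60, 2))           # confounding variables (assumed)
y = rng.integers(0, 3, size=60)   # disease type labels (assumed)

W, H = nmf(X, k=4)
# Gene-group loadings, confounders, and an intercept column form the features.
F = np.hstack([W, C, np.ones((X.shape[0], 1))])
B = fit_softmax(F, y, n_classes=3)
probs = softmax(F @ B)            # per-sample class probabilities
```

In this sketch each row of `H` plays the role of a gene group and each row of `W` gives a sample's loadings on those groups. A joint supervised formulation would instead let the classification loss influence `W` and `H` while they are being learned.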

06/12/2020
