Reconstruction and decomposition of high-dimensional landscapes via unsupervised learning

Abstract: Uncovering the organization of a landscape that encapsulates all states of a dynamic system is a central task in many domains, as it promises to reveal, in an unsupervised manner, a system’s inner workings. One domain where this task is crucial is bioinformatics, where the energy landscape that organizes the three-dimensional structures of a molecule by their energetics is a powerful construct. The landscape can be leveraged, among other things, to reveal macrostates in which a molecule is biologically active. This is a daunting task, as landscapes of complex actuated systems, such as molecules, are inherently high-dimensional. Nonetheless, our laboratories have made some progress in recent years via topological and statistical analysis of spatial data. The methods we have proposed fall into what is essentially a dichotomy: methods better suited to visualization-driven discovery, and methods better suited to discovering the biologically active macrostates but not amenable to visualization. In this paper, we present a novel hybrid method that combines the strengths of both, allowing both visualization of the landscape and discovery of macrostates. We compare what the method uncovers against existing methods on structure spaces sampled with conformational sampling algorithms. Though the direct evaluation in this paper is on protein energy landscapes, the proposed method is of broad interest for cross-cutting problems that require characterization of fitness and optimization landscapes.

26/04/2020

Probabilistic Methods, Gaussian Processes, Algorithms, Multitask and Transfer Learning; Probabilistic Methods, Variational Inference, Applications, Computational Biology and Bioinformatics

5:09

06/12/2021

Pedro Hermosilla Casajus, Marco Schäfer, Matej Lang, Gloria Fackelmann, Pere-Pau Vázquez, Barbora Kozlikova, Michael Krone, Tobias Ritschel, Timo Ropinski

Amalgamation of protein sequence, structure and textual information for improving protein-protein interaction identification

Ziyu Xiang, Mingzhou Fan, Guillermo Vázquez Tovar, William Trehern, Byung-Jun Yoon, Xiaofeng Qian, Raymundo Arroyave, Xiaoning Qian

Sai Krishna Gottipati, Boris Sattarov, Sufeng Niu, Haoran Wei, Yashaswi Pathak, Shengchao Liu, Simon Blackburn, Karam Thomas, Connor Coley, Jian Tang, Sarath Chandar, Yoshua Bengio