Adaptive Learning of Rank-One Models for Efficient Pairwise Sequence Alignment

06/12/2020

Govinda Kamath, Tavor Baharav, Ilan Shomorony

Abstract: Pairwise alignment of DNA sequencing data is a ubiquitous task in bioinformatics and typically represents a heavy computational burden. State-of-the-art approaches to speed up this task use hashing to identify short segments (k-mers) that are shared by pairs of reads, which can then be used to estimate alignment scores. However, when the number of reads is large, accurately estimating alignment scores for all pairs is still very costly. Moreover, in practice, one is only interested in identifying pairs of reads with large alignment scores. In this work, we propose a new approach to pairwise alignment estimation based on two key new ingredients. The first ingredient is to cast the problem of pairwise alignment estimation under a general framework of rank-one crowdsourcing models, where the workers' responses correspond to k-mer hash collisions. These models can be accurately solved via a spectral decomposition of the response matrix. The second ingredient is to utilise a multi-armed bandit algorithm to adaptively refine this spectral estimator only for read pairs that are likely to have large alignments. The resulting algorithm iteratively performs a spectral decomposition of the response matrix for adaptively chosen subsets of the read pairs.
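
The two estimation ingredients in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`kmers`, `min_hash`, `response_matrix`, `rank_one_fit`), the choice of min-hashing with salted BLAKE2b, the parameters `k=5` and `num_hashes=64`, and the toy reads are all assumptions made for illustration. Each row of the response matrix is one hash function (a "worker") and each column is one read compared against a reference read. A plain power iteration stands in for the spectral decomposition, and the multi-armed bandit step is omitted.

```python
import hashlib


def kmers(read, k):
    """Set of all length-k substrings; assumes len(read) >= k."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def min_hash(kmer_set, seed):
    """Smallest 64-bit hash over the k-mers; `seed` selects the hash function."""
    salt = seed.to_bytes(8, "little")
    return min(
        int.from_bytes(
            hashlib.blake2b(km.encode(), digest_size=8, salt=salt).digest(),
            "little",
        )
        for km in kmer_set
    )


def response_matrix(reference, others, k=5, num_hashes=64):
    """Entry [h][j] is 1.0 when hash h gives the same minimum for the
    reference read and for others[j] (a min-hash collision), else 0.0."""
    ref_kmers = kmers(reference, k)
    other_kmers = [kmers(read, k) for read in others]
    rows = []
    for seed in range(num_hashes):
        ref_min = min_hash(ref_kmers, seed)
        rows.append(
            [1.0 if min_hash(s, seed) == ref_min else 0.0 for s in other_kmers]
        )
    return rows


def rank_one_fit(matrix, iters=50):
    """Power iteration for the leading singular pair.
    Returns (u, v) with u of unit norm so that matrix[i][j] ~ u[i] * v[j]."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0] * n_cols
    u = [0.0] * n_rows
    for _ in range(iters):
        u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
        norm = sum(x * x for x in u) ** 0.5 or 1.0  # guard the all-zero case
        u = [x / norm for x in u]
        v = [sum(matrix[i][j] * u[i] for i in range(n_rows))
             for j in range(n_cols)]
    return u, v


# Toy reads (hypothetical): an identical copy, a partial overlap, and a read
# that shares no 5-mer with the reference.
reference = "ACGTACGTTGCAAGCTTAGGCTAACGT"
others = [reference, reference[8:] + "GGATCCATGCA", "TTTTTTTTTTTTCCCCCCCCCCCC"]

Y = response_matrix(reference, others)
# Column means are the plain min-hash collision rates for each read.
rates = [sum(row[j] for row in Y) / len(Y) for j in range(len(others))]
# Rank-one fit: v carries a per-read score, u a per-hash-function weight.
u, v = rank_one_fit(Y)
```

In this sketch the column means weight every hash function equally, while the rank-one fit lets hash functions contribute unequally through `u`. Which of the two is more accurate depends on how well the rank-one assumption holds for the data.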

The talk and the corresponding paper were published at the NeurIPS 2020 virtual conference.

Similar Papers

06/12/2020

A Convolutional Auto-Encoder for Haplotype Assembly and Viral Quasispecies Reconstruction

Ziqi Ke, Haris Vikalo

Keywords: Applications -> Time Series Analysis; Theory -> Control Theory; Deep Learning -> Recurrent Networks

3:19

18/07/2021

You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling

Zhanpeng Zeng, Yunyang Xiong, Sathya Ravi, Shailesh Acharya, Glenn Fung, Vikas Singh

Keywords: Applications, Natural Language Processing

5:16

06/12/2020

Fourier-transform-based attribution priors improve the interpretability and stability of deep learning models for genomics

Alex Tseng, Avanti Shrikumar, Anshul Kundaje

3:21

23/08/2020

MinSearch: An efficient algorithm for similarity search under edit distance

Haoyu Zhang, Qin Zhang

Keywords: edit distance, top-k query, similarity search

19:41

06/12/2020

Diversity-Guided Multi-Objective Bayesian Optimization With Batch Evaluations

Mina Konakovic Lukovic, Yunsheng Tian, Wojciech Matusik

3:22

19/04/2021

Progressively pretrained dense corpus index for open-domain question answering

Wenhan Xiong, Hong Wang, William Yang Wang

12:15

06/12/2020

O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers

Chulhee Yun, Yin-Wen Chang, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank Reddi, Sanjiv Kumar

3:23

05/12/2020

Massively multilingual document alignment with cross-lingual sentence-mover’s distance

Ahmed El-Kishky, Francisco Guzmán

14:59

06/12/2021

Greedy Approximation Algorithms for Active Sequential Hypothesis Testing

Kyra Gan, Su Jia, Andrew Li

Keywords: active learning

14:03

30/11/2020

Fast and Differentiable Message Passing on Pairwise Markov Random Fields

Zhiwei Xu, Thalaiyasingam Ajanthan, Richard Hartley

9:41

17/08/2020

NASOQ: Numerically accurate sparsity-oriented QP solver

Kazem Cheshmi, Danny M. Kaufman, Shoaib Kamil, Maryam Mehri Dehnavi

Keywords: indefinite factorization, numerical optimization, contact simulation, sparse row modification, mesh deformation, quadratic programming, sparse linear algebra

15:27

14/09/2020

6VecLM: Language Modeling in Vector Space for IPv6 Target Generation

Tianyu Cui, Gang Xiong, Gaopeng Gou, Junzheng Shi, Wei Xia

Keywords: ipv6 target generation, deep learning, data mining, network measurement, natural language processing

13:04

16/11/2020

Wasserstein Distance Regularized Sequence Representation for Text Matching in Asymmetrical Domains

Weijie Yu, Chen Xu, Jun Xu, Liang Pang, Xiaopeng Gao, Xiaozhao Wang, Ji-Rong Wen

Keywords: real-world practices, text matching, matching models, match method

11:43

02/02/2021

Synchronous Interactive Decoding for Multilingual Neural Machine Translation

Hao He, Qian Wang, Zhipeng Yu, Yang Zhao, Jiajun Zhang, Chengqing Zong

14:32

19/08/2021

CIMON: Towards High-quality Hash Codes

Xiao Luo, Daqing Wu, Zeyu Ma, Chong Chen, Minghua Deng, Jinwen Ma, Zhongming Jin, Jianqiang Huang, Xian-Sheng Hua

Keywords: Computer Vision, Recognition, Information Retrieval

14:20

26/08/2020

MAP Inference for Customized Determinantal Point Processes via Maximum Inner Product Search

Insu Han, Jennifer Gillenwater

16:01

19/08/2021

k-Nearest Neighbors by Means of Sequence to Sequence Deep Neural Networks and Memory Networks

Yiming Xu, Diego Klabjan

Keywords: Machine Learning, Deep Learning, Class Imbalance and Unequal Cost, Classification

6:22

03/05/2021

Filtered Inner Product Projection for Crosslingual Embedding Alignment

Vin Sachidananda, Ziyi Yang, Chenguang Zhu

Keywords: multilingual representations, natural language processing, word embeddings

5:22

16/11/2020

SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup

Rongzhi Zhang, Yue Yu, Chao Zhang

Keywords: low-resource tasks, active labeling, mixup, sequence mixup

11:16

06/12/2021

Reverse-Complement Equivariant Networks for DNA Sequences

Vincent Mallet, Jean-Philippe Vert

Keywords: deep learning, machine learning

13:00

03/05/2021

Random Feature Attention

Hao Peng, Nikolaos Pappas, Dani Yogatama, Roy Schwartz, Noah Smith, Lingpeng Kong

Keywords: machine translation, transformers, language modeling, Attention

10:20

26/04/2020

Decoding As Dynamic Programming For Recurrent Autoregressive Models

Najam Zaidi, Trevor Cohn, Gholamreza Haffari

Keywords: Decoding

5:29

19/04/2021

Expanding, retrieving and infilling: Diversifying cross-domain question generation with flexible templates

Xiaojing Yu, Anxiao Jiang

11:40

12/07/2020

A Chance-Constrained Generative Framework for Sequence Optimization

Xianggen Liu, Jian Peng, Qiang Liu, Sen Song

Keywords: Deep Learning - Generative Models and Autoencoders

12:40

26/10/2020

Through the Lens of Sequence Submodularity

Sara Bernardini, Fabio Fagnani, Chiara Piacentini

Keywords: Greedy algorithms, Submodularity, Sequence functions, Search, Scheduling, Recommender Systems

10:31

26/04/2020

Learning-Augmented Data Stream Algorithms

Tanqiu Jiang, Yi Li, Honghao Lin, Yisong Ruan, David P. Woodruff

Keywords: streaming algorithms, heavy hitters, F_p moment, distinct elements, cascaded norms

3:55

14/06/2020

Fast(er) Reconstruction of Shredded Text Documents via Self-Supervised Deep Asymmetric Metric Learning

Thiago M. Paixão, Rodrigo F. Berriel, Maria C. S. Boeres, Alessandro L. Koerich, Claudine Badue, Alberto F. De Souza, Thiago Oliveira-Santos

Keywords: shredded document reconstruction, asymmetric metric learning, fully convolutional neural networks, jigsaw puzzle, compatibility evaluation

1:01

03/08/2020

High Dimensional Discrete Integration over the Hypergrid

Raj Kumar Maity, Arya Mazumdar, Soumyabrata Pal

8:46

16/11/2020

Best-First Beam Search

Clara Meister, Ryan Cotterell, Tim Vieira

Keywords: nlp tasks, exact search, decoding, heuristic algorithm

12:19

23/08/2020

Efficient algorithm for the b-matching graph

Yasuhiro Fujiwara, Atsutoshi Kumagai, Sekitoshi Kanai, Yasutoshi Ida, Naonori Ueda

Keywords: efficient, algorithm, b-matching graph

15:28

14/07/2020

Closing the gap between cache-oblivious and cache-adaptive analysis

Michael A. Bender, Rezaul A. Chowdhury, Rathish Das, Rob Johnson, William Kuszmaul, Andrea Lincoln, Quanquan C. Liu, Jayson Lynch, Helen Xu

Keywords: cache-adaptive algorithms, smoothed analysis, cache-oblivious algorithms

18:06

12/07/2020

On Learning Language-Invariant Representations for Universal Machine Translation

Han Zhao, Junjie Hu, Andrej Risteski

Keywords: Learning Theory

21:57

06/07/2020

Deep learning-based parameter mapping for joint relaxation and diffusion tensor MR Fingerprinting

Carolin M. Pirk, Pedro A. Gomez, Ilona Lipp, Guido Buonincontri, Miguel Molina-Romero, Anjany Sekuboyina, Diana Waldmannstetter, Jonathan Dannenberg, Sebastian Endt, Alberto Merola, Joseph R. Whittaker, Valentina Tomassini, Michela Tosetti, Derek K. Jones, Bjoern H. Menze, Marion Menzel

4:59

04/07/2020

Paraphrase Generation by Learning How to Edit from Samples

Amirhossein Kazemnejad, Mohammadreza Salehi, Mahdieh Soleymani Baghshah

Keywords: Paraphrase Generation, Neural sequence, sequence generation, retrieval-based method

12:20

06/12/2021

Quantifying and Improving Transferability in Domain Generalization

Guojun Zhang, Han Zhao, Yaoliang Yu, Pascal Poupart

Keywords: domain adaptation, transfer learning

10:27

23/08/2020

Compositional embeddings using complementary partitions for memory-efficient recommendation systems

Hao-Jun Michael Shi, Dheevatsa Mudigere, Maxim Naumov, Jiyan Yang

Keywords: embeddings, model compression, recommendation systems

16:14

26/10/2020

Contention-Aware Mapping and Scheduling Optimization for NoC-Based MPSoCs

Rongjie Yan, Yupeng Zhou, Anyu Cai, Changwen Li, Yige Yan, Minghao Yin

Keywords: MPSoCs, mapping and scheduling, multi-objective optimization, genetic algorithms, local search, exact methods

9:31

19/10/2020

MetaTPOT: Enhancing a tree-based pipeline optimization tool using meta-learning

Doron Laadan, Roman Vainshtein, Yarden Curiel, Gilad Katz, Lior Rokach

Keywords: tpot, meta-learning, genetic programming (gp), automl

6:41

12/07/2020

Population-Based Black-Box Optimization for Biological Sequence Design

Christof Angermueller, David Belanger, Andreea Gane, Zelda Mariet, David Dohan, Kevin Murphy, Lucy Colwell, D. Sculley

Keywords: Applications - Neuroscience, Cognitive Science, Biology and Health

7:17

04/07/2020

Effectively Aligning and Filtering Parallel Corpora under Sparse Data Conditions

Steinþór Steingrímsson, Hrafn Loftsson, Andy Way

Keywords: Aligning Corpora, machine systems, data problem, alignment problem

11:47