Evaluating the Utility of Model Configurations and Data Augmentation on Clinical Semantic Textual Similarity

01/07/2020

Yuxia Wang, Fei Liu, Karin Verspoor, Timothy Baldwin

Abstract: In this paper, we apply pre-trained language models to the Semantic Textual Similarity (STS) task, with a specific focus on the clinical domain. In the low-resource setting of clinical STS, these large models tend to be impractical and prone to overfitting. Building on BERT, we study the impact of a number of model design choices, namely different fine-tuning and pooling strategies. We observe that the impact of domain-specific fine-tuning on clinical STS is much smaller than in the general domain, likely due to the concept richness of the domain. Based on this, we propose two data augmentation techniques. Experimental results on N2C2-STS demonstrate substantial improvements, validating the utility of the proposed methods.

The talk and the corresponding paper were published at the ACL Workshops virtual conference.

Similar Papers

14/06/2020

On Vocabulary Reliance in Scene Text Recognition

Zhaoyi Wan, Jielei Zhang, Liang Zhang, Jiebo Luo, Cong Yao

Keywords: scene text recognition, text spotting, document analysis, ocr, scene text detection, sequence recognition, language and vision

1:00

02/02/2021

IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization

Wenxuan Zhou, Bill Yuchen Lin, Xiang Ren

16:25

07/08/2020

Robust Benchmarking for Machine Learning of Clinical Entity Extraction

Monica Agrawal, Chloe O’Connell, Yasmin Fatemi, Ariel Levy, David Sontag

2:56

02/02/2021

Merging Statistical Feature via Adaptive Gate for Improved Text Classification

Xianming Li, Zongxi Li, Haoran Xie, Qing Li

14:56

19/04/2021

Alignment verification to improve NMT translation towards highly inflectional languages with limited resources

George Tambouratzis

12:02

22/11/2021

Latent-optimization based Disease-aware Image Editing for Medical Image Augmentation

Aakash Saboo, Prashnna K Gyawali, Ankit Shukla, Manoj Sharma, Neeraj Jain, Linwei Wang

Keywords: Latent optimization, StyleGAN, Image Editing, Chest X-ray, Image manipulation, constrained optimization, Disease progression, Disease quantification, Manifold, Latent space traversal

2:39

06/07/2020

DIVA: Domain Invariant Variational Autoencoders

Maximilian Ilse, Jakub M. Tomczak, Christos Louizos, Max Welling

8:45

04/07/2020

Calibrating Structured Output Predictors for Natural Language Processing

Abhyuday Jagannatha, Hong Yu

Keywords: Natural Processing, natural applications, NLP applications, named recognition

11:44

04/07/2020

Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization

Sajad Sotudeh Gharebagh, Nazli Goharian, Ross Filice

Keywords: Content Selection, Clinical Summarization, text task, content problem

7:03

02/02/2021

Precise Yet Efficient Semantic Calibration and Refinement in ConvNets for Real-time Polyp Segmentation from Colonoscopy Videos

Huisi Wu, Jiafu Zhong, Wei Wang, Zhenkun Wen, Jing Qin

17:40

08/12/2020

Best Practices for Data-Efficient Modeling in NLG: How to Train Production-Ready Neural Models with Less Data

Ankit Arun, Soumya Batra, Vikas Bhardwaj, Ashwini Challa, Pinar Donmez, Peyman Heidari, Hakan Inan, Shashank Jain, Anuj Kumar, Shawn Mei, Karthik Mohan, Michael White

15:01

26/04/2020

Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model

Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov

5:00

16/11/2020

Unified Feature and Instance Based Domain Adaptation for Aspect-Based Sentiment Analysis

Chenggong Gong, Jianfei Yu, Rui Xia

Keywords: aspect-based analysis, absa task, feature-based adaptation, auxiliary tasks

12:12

04/07/2020

Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset

Xiang Yue, Bernal Jimenez Gutierrez, Huan Sun

Keywords: Clinical Comprehension, Machine comprehension, annotation, question answering

11:40

07/08/2020

Towards an Automated SOAP Note: Classifying Utterances from Medical Conversations

Benjamin Schloss, Sandeep Konam

2:57

19/10/2020

Distant supervision in BERT-based adhoc document retrieval

Koustav Rudra, Avishek Anand

Keywords: distant supervision, adhoc retrieval, document ranking

6:49

07/09/2020

Semi-supervised Active Learning for Instance Segmentation via Scoring Predictions

Jun Wang, Shaoguo Wen, Jianghua Yu, Kaixing Chen, Xin Zhou, Peng Gao, Guotong Xie, Changsheng Li

Keywords: instance segmentation, active learning, semi-supervised learning, medical images

7:48

08/12/2020

SentiX: A Sentiment-Aware Pre-Trained Model for Cross-Domain Sentiment Analysis

Jie Zhou, Junfeng Tian, Rui Wang, Yuanbin Wu, Wenming Xiao, Liang He

12:42

04/07/2020

Structured Tuning for Semantic Role Labeling

Tao Li, Parth Anand Jawale, Martha Palmer, Vivek Srikumar

Keywords: Semantic Labeling, Structured Tuning, expressive representations, knowledge-rich mechanisms

12:07

19/08/2021

ALaSca: an Automated approach for Large-Scale Lexical Substitution

Caterina Lacerra, Tommaso Pasini, Rocco Tripodi, Roberto Navigli

Keywords: Natural Language Processing, Natural Language Semantics, Resources and Evaluation

14:27

25/07/2020

A pairwise probe for understanding BERT fine-tuning on machine reading comprehension

Jie Cai, Zhengzhou Zhu, Ping Nie, Qian Liu

Keywords: machine reading comprehension, pairwise, fine-tune, BERT

6:38

01/07/2020

Exploring the Limits of Simple Learners in Knowledge Distillation for Document Classification with DocBERT

Ashutosh Adhikari, Achyudh Ram, Raphael Tang, William L. Hamilton, Jimmy Lin

4:55

02/02/2021

GRASP: Generic Framework for Health Status Representation Learning Based on Incorporating Knowledge from Similar Patients

Chaohe Zhang, Xin Gao, Liantao Ma, Yasha Wang, Jiangtao Wang, Wen Tang

15:37

03/05/2021

Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning

Beliz Gunel, Jingfei Du, Alexis Conneau, Veselin Stoyanov

Keywords: supervised contrastive learning, pre-trained language model fine-tuning, natural language understanding, generalization, few-shot learning, robustness

4:44

16/11/2020

To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging

Kasturi Bhattacharjee, Miguel Ballesteros, Rishita Anubhai, Smaranda Muresan, Jie Ma, Faisal Ladhak, Yaser Al-Onaizan

Keywords: learning representations, downstream tasks, cross-view cvt, sequence tasks

6:26

06/12/2021

SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning

Talip Ucar, Ehsan Hajiramezanali, Lindsay Edwards

Keywords: self-supervised learning, contrastive learning, representation learning

13:28

26/04/2020

Improving Neural Language Generation with Spectrum Control

Lingxiao Wang, Jing Huang, Kevin Huang, Ziniu Hu, Guangtao Wang, Quanquan Gu

4:58

03/05/2021

Neural Topic Model via Optimal Transport

He Zhao, Dinh Phung, Viet Huynh, Trung Le, Wray Buntine

Keywords: optimal transport, document analysis, topic modelling

9:29

05/12/2020

Beyond fine-tuning: Few-sample sentence embedding transfer

Siddhant Garg, Rohit Kumar Sharma, Yingyu Liang

9:56

18/07/2021

Which transformer architecture fits my data? A vocabulary bottleneck in self-attention

Noam Wies, Yoav Levine, Daniel Jannai, Amnon Shashua

Keywords: Theory, Deep learning Theory

5:11

08/12/2020

AutoMeTS: The Autocomplete for Medical Text Simplification

Hoang Van, David Kauchak, Gondy Leroy

13:29

16/11/2020

Sparse Text Generation

Pedro Henrique Martins, Zita Marinho, André F. T. Martins

Keywords: story completion, dialogue generation, text generators, language models

11:27

19/08/2021

Modelling General Properties of Nouns by Selectively Averaging Contextualised Embeddings

Na Li, Zied Bouraoui, Jose Camacho-Collados, Luis Espinosa-Anke, Qing Gu, Steven Schockaert

Keywords: Natural Language Processing, Natural Language Semantics

14:09

03/05/2021

CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding

Yanru Qu, Dinghan Shen, Yelong Shen, Sandra Sajeev, Weizhu Chen, Jiawei Han

Keywords: consistency training, contrastive learning, data augmentation, natural language understanding

6:02

06/12/2021

Adaptive wavelet distillation from neural networks through interpretations

Wooseok Ha, Chandan Singh, Francois Lanusse, Srigokul Upadhyayula, Bin Yu

Keywords: deep learning, interpretability

14:56

02/02/2021

Noninvasive Self-attention for Side Information Fusion in Sequential Recommendation

Chang Liu, Xiaoguang Li, Guohao Cai, Zhenhua Dong, Hong Zhu, Lifeng Shang

18:26

04/07/2020

SPECTER: Document-level Representation Learning using Citation-informed Transformers

Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel Weld

Keywords: Document-level Learning, Representation learning, natural systems, classification

13:07

16/11/2020

Towards Better Context-aware Lexical Semantics: Adjusting Contextualized Representations through Static Anchors

Qianchu Liu, Diana McCarthy, Anna Korhonen

Keywords: transformation, contextualized models, dynamic embeddings, post-processing technique

6:53

22/11/2021

FAR: A General Framework for Attributional Robustness

Adam Ivankay, Ivan Girardi, Chiara Marchiori, Pascal Frossard

Keywords: robustness, attribution robustness, adversarial attacks, explainability, attribution maps

3:00

04/07/2020

Interpreting Pretrained Contextualized Representations via Reductions to Static Embeddings

Rishi Bommasani, Kelly Davis, Claire Cardie

Keywords: Interpreting Representations, downstream applications, static embeddings, Pretrained Representations

12:07