Ensemble BERT for Classifying Medication-mentioning Tweets

08/12/2020

Huong Dang, Kahyun Lee, Sam Henry, Özlem Uzuner

Abstract: Twitter is a valuable source of patient-generated data that has been used in various population health studies. The first step in many of these studies is to identify and capture Twitter messages (tweets) containing medication mentions. In this article, we describe our submission to Task 1 of the Social Media Mining for Health Applications (SMM4H) Shared Task 2020. This task challenged participants to detect tweets that mention medications or dietary supplements in a natural, highly imbalanced dataset. Our system combined a handcrafted preprocessing step with an ensemble of 20 BERT-based classifiers, generated by dividing the training dataset into subsets using 10-fold cross-validation and exploiting two BERT embedding models (10 folds × 2 models). Our system ranked first in this task and exceeded the average F1 score across all participating teams by 19.07%, with a precision, recall, and F1 on the test set of 83.75%, 87.01%, and 85.35%, respectively.
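The ensemble construction described in the abstract (10 cross-validation folds × 2 models = 20 classifiers) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the 20 predictions were combined, so averaging probabilities and thresholding at 0.5 is an assumption, and the toy keyword "trainers" are hypothetical stand-ins for fine-tuning two different BERT models. All function names here are made up for the example.

```python
from typing import Callable, List, Sequence, Tuple

# A classifier maps a tweet to an estimated probability that it mentions a medication.
Classifier = Callable[[str], float]
# A trainer fits a classifier on (texts, labels). Hypothetical stand-in for BERT fine-tuning.
Trainer = Callable[[Sequence[str], Sequence[int]], Classifier]


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, held_out) index pairs that together cover range(n_samples)."""
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    splits = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n_samples) if i not in held]
        splits.append((train, held_out))
    return splits


def build_ensemble(texts: Sequence[str], labels: Sequence[int],
                   trainers: Sequence[Trainer], k: int = 10) -> List[Classifier]:
    """Train one classifier per (trainer, fold) pair: len(trainers) * k members."""
    members = []
    for train_fn in trainers:
        for train_idx, _held_out in k_fold_indices(len(texts), k):
            fold_texts = [texts[i] for i in train_idx]
            fold_labels = [labels[i] for i in train_idx]
            members.append(train_fn(fold_texts, fold_labels))
    return members


def predict(members: Sequence[Classifier], text: str, threshold: float = 0.5) -> int:
    """ASSUMPTION: combine members by averaging probabilities, then thresholding."""
    mean_prob = sum(member(text) for member in members) / len(members)
    return int(mean_prob >= threshold)


def make_keyword_trainer(lowercase: bool) -> Trainer:
    """Toy trainer (NOT BERT): flags words seen only in positive training tweets."""
    def normalise(text: str) -> str:
        return text.lower() if lowercase else text

    def train(texts: Sequence[str], labels: Sequence[int]) -> Classifier:
        pos_words, neg_words = set(), set()
        for text, label in zip(texts, labels):
            (pos_words if label == 1 else neg_words).update(normalise(text).split())
        cues = pos_words - neg_words

        def classify(text: str) -> float:
            return 1.0 if cues & set(normalise(text).split()) else 0.0

        return classify

    return train


# Toy data, invented for illustration only.
texts = ["took ibuprofen for my headache", "great game last night"] * 10
labels = [1, 0] * 10
trainers = [make_keyword_trainer(lowercase=False), make_keyword_trainer(lowercase=True)]
members = build_ensemble(texts, labels, trainers, k=10)
print(len(members))  # 2 trainers x 10 folds = 20 members
print(predict(members, "need ibuprofen now"), predict(members, "what a great game"))
```

In a real system each trainer would fine-tune a BERT model on the nine training folds and could use the held-out fold for validation or early stopping; on a highly imbalanced dataset, stratified folds would probably be preferable to the simple interleaved split used here.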

Video of the talk: https://underline.io/lecture/6453-ensemble-bert-for-classifying-medication-mentioning-tweets

The talk and the accompanying paper were published at the COLING Workshops 2020 virtual conference.

Similar Papers

08/12/2020

ISLab System for SMM4H Shared Task 2020

Chen-Kai Wang, Hong-Jie Dai, You-Chen Zhang, Bo-Chun Xu, Bo-Hong Wang, You-Ning Xu, Po-Hao Chen, Chung-Hong Lee

11:56

07/06/2021

The Healthy States of America: Creating a Health Taxonomy with Social Media

Sanja Šćepanović, Luca Maria Aiello, Ke Zhou, Sagar Joglekar, Daniele Quercia

Keywords: Qualitative and quantitative studies of social media, Credibility of online content, Measuring predictability of real world phenomena based on social media, e.g., spanning politics, finance, and health

8:00

23/07/2020

Extracting medical entities from social media

Sanja Scepanovic, Enrique Martin-Lopez, Daniele Quercia, Khan Baykaner

Keywords: Applied computing, Life and medical sciences, Health informatics, Computing methodologies, Artificial intelligence, Natural language processing

6:12

07/06/2020

Mining Archive.org’s Twitter Stream Grab for Pharmacovigilance Research Gold

Ramya Tekumalla, Javad Rafiei Asl, Juan M. Banda

Keywords: building, learning, trends, tweets, twitter

3:07

08/12/2020

COVID-19 Twitter Monitor: Aggregating and Visualizing COVID-19 Related Trends in Social Media

Joseph Cornelius, Tilia Ellendorff, Lenz Furrer, Fabio Rinaldi

9:50

07/06/2020

Variation across Scales: Measurement Fidelity under Twitter Data Sampling

Siqi Wu, Marian-Andrei Rizoiu, Lexing Xie

Keywords: attention, bias, cascades, changes, collection, graphs, influences, measures, networks, rates, structure, terms, tweets, twitter

9:59

25/07/2020

APS: An active PubMed search system for technology assisted reviews

Dan Li, Panagiotis Zafeiriadis, Evangelos Kanoulas

Keywords: systematic reviews, PubMed, active search, TAR

7:40

25/07/2020

Proposal and comparison of health specific features for the automatic assessment of readability

Hélder Antunes, Carla Teixeira Lopes

Keywords: natural language processing, machine learning, readability, consumer health search

9:34

16/11/2020

Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition

Yun He, Ziwei Zhu, Yin Zhang, Qin Chen, James Caverlee

Keywords: health-related tasks, consumer answering, medical inference, disease recognition

11:47

19/04/2021

Mega-COV: A billion-scale dataset of 100+ languages for COVID-19

Muhammad Abdul-Mageed, AbdelRahim Elmadany, El Moatez Billah Nagoudi, Dinesh Pabbi, Kunal Verma, Rannie Lin

12:58

07/06/2020

Pie Chart or Pizza: Identifying Chart Types and their Virality on Twitter

Pavlos Vougiouklis, Leslie Carr, Elena Simperl

Keywords: classification, graphs, images, images shared, networks, predictions, shared, twitter, types

12:38

25/07/2020

What makes a top-performing precision medicine search engine? Tracing main system features in a systematic way

Erik Faessler, Michel Oleynik, Udo Hahn

Keywords: precision medicine, trec, search engine evaluation

12:27

19/04/2021

Predicting treatment outcome from patient texts: the case of Internet-based cognitive behavioural therapy

Evangelia Gogoulou, Magnus Boman, Fehmi Ben Abdesslem, Nils Hentati Isacsson, Viktor Kaldo, Magnus Sahlgren

5:17

07/08/2020

Knowledge Base Completion for Constructing Problem-Oriented Medical Records

James Mullenbach, Jordan Swartz, T. Greg McKelvey, Hui Dai, David Sontag

3:05

01/07/2020

Deep Learning-based Online Alternative Product Recommendations at Scale

Mingming Guo, Nian Yan, Xiquan Cui, San He Wu, Unaiza Ahsan, Rebecca West, Khalifeh Al Jadda

18:02

04/07/2020

Cross-Lingual Disaster-related Multi-label Tweet Classification with Manifold Mixup

Jishnu Ray Chowdhury, Cornelia Caragea, Doina Caragea

Keywords: Cross-Lingual Classification, Distinguishing messages, disaster management, multi-label tweets

12:49

04/07/2020

Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time

Benjamin Nye, Ani Nenkova, Iain Marshall, Byron C. Wallace

Keywords: Mapping Evidence, automatic maps, Trialstreamer, evidence component

11:28

19/10/2020

AGATHA: Automatic graph mining and transformer based hypothesis generation approach

Justin Sybrandt, Ilya Tyagin, Michael Shtutman, Ilya Safro

Keywords: hypothesis generation, transformer models, literature-based discovery, biomedical recommendation, semantic networks

8:20

04/07/2020

uBLEU: Uncertainty-Aware Automatic Evaluation Method for Open-Domain Dialogue Systems

Tsuta Yuma, Naoki Yoshinaga, Masashi Toyoda

Keywords: Open-Domain Systems, uBLEU, Uncertainty-Aware Method, ΔBLEU

11:07

06/12/2021

SyncTwin: Treatment Effect Estimation with Longitudinal Outcomes

Zhaozhi Qian, Yao Zhang, Ioana Bica, Angela Wood, Mihaela van der Schaar

Keywords: causality, interpretability

7:54

19/04/2021

BERT prescriptions to avoid unwanted headaches: A comparison of transformer architectures for adverse drug event detection

Beatrice Portelli, Edoardo Lenzi, Emmanuele Chersoni, Giuseppe Serra, Enrico Santus

6:56

06/07/2020

Extending Unsupervised Neural Image Compression With Supervised Multitask Learning

David Tellez, Diederik Höppener, Cornelis Verhoef, Dirk Grünhagen, Pieter Nierop, Michal Drozdzal, Jeroen Laak, Francesco Ciompi

12:21

25/07/2020

MGNN: A multimodal graph neural network for predicting the survival of cancer patients

Jianliang Gao, Tengfei Lyu, Fan Xiong, Jianxin Wang, Weimao Ke, Zhao Li

Keywords: multimodal, medical information retrieval, graph neural networks, cancer survival prediction

7:57

04/07/2020

SUPP.AI: finding evidence for supplement-drug interactions

Lucy Wang, Oyvind Tafjord, Arman Cohan, Sarthak Jain, Sam Skjonsberg, Carissa Schoenick, Nick Botner, Waleed Ammar

Keywords: browsing interactions, browsing SDIs, SDI identification, identifying interactions

12:12

01/07/2020

Generating Medical Reports from Patient-Doctor Conversations Using Sequence-to-Sequence Models

Seppo Enarvi, Marilisa Amoia, Miguel Del-Agua Teba, Brian Delaney, Frank Diehl, Stefan Hahn, Kristina Harris, Liam McGrath, Yue Pan, Joel Pinto, Luca Rubini, Miguel Ruiz, Gagandeep Singh, Fabian Stemmer, Weiyi Sun, Paul Vozila, Thomas Lin, Ranjani Ramamurthy

8:14

02/02/2021

Project RISE: Recognizing Industrial Smoke Emissions

Yen-Chia Hsu, Ting-Hao (Kenneth) Huang, Ting-Yao Hu, Paul Dille, Sean Prendi, Ryan Hoffman, Anastasia Tsuhlares, Jessica Pachuta, Randy Sargent, Illah Nourbakhsh

17:37

04/07/2020

Beyond User Self-Reported Likert Scale Ratings: A Comparison Model for Automatic Dialog Evaluation

Weixin Liang, James Zou, Zhou Yu

Keywords: Automatic Evaluation, Open evaluation, dialog research, dialog evaluation

11:24

06/07/2020

PathologyGAN: Learning deep representations of cancer tissue

Adalberto Claudio Quiros, Roderick Murray-Smith, Ke Yuan

5:00

04/07/2020

Predicting the Topical Stance and Political Leaning of Media using Tweets

Peter Stefanov, Kareem Darwish, Atanas Atanasov, Preslav Nakov

Keywords: supervised solutions, cascaded method, unsupervised learning, supervised learning

7:16

14/09/2020

Ada-Boundary: Accelerating DNN Training via Adaptive Boundary Batch Selection

Hwanjun Song, Sundong Kim, Minseok Kim, Jae-Gil Lee

10:52

23/07/2020

Disease state prediction from single-cell data using graph attention networks

Neal Ravindra, Arijit Sehanobish, Jenna L. Pappalardo, David A. Hafler, David van Dijk

Keywords: Applied computing, Life and medical sciences, Computational biology, Computational transcriptomics, Computing methodologies, Machine learning, Machine learning algorithms, Feature selection, Machine learning approaches, Neural networks

8:07

07/08/2020

Towards an Automated SOAP Note: Classifying Utterances from Medical Conversations

Benjamin Schloss, Sandeep Konam

2:57

07/06/2021

Classifying Reasonability in Retellings of Personal Events Shared on Social Media: A Preliminary Case Study with /r/AmITheAsshole

Ethan Haworth, Ted Grover, Justin Langston, Ankush Patel, Joseph West, Alex C. Williams

Keywords: Subjectivity in textual data, sentiment analysis, polarity/opinion identification and extraction, linguistic analyses of social media behavior, Trend identification and tracking, time series forecasting, Measuring predictability of real world phenomena bas

5:11

23/07/2020

TASTE: temporal and static tensor factorization for phenotyping electronic health records

Ardavan Afshar, Ioakeim Perros, Haesun Park, Christopher deFilippi, Xiaowei Yan, Walter Stewart, Joyce Ho, Jimeng Sun

Keywords: Computing methodologies, Machine learning, Learning paradigms, Unsupervised learning, Dimensionality reduction and manifold learning

7:55

19/10/2020

ART (attractive recommendation tailor): How the diversity of product recommendations affects customer purchase preference in fashion industry?

Hyokmin Kwon, Jaeho Han, Kyungsik Han

Keywords: diversity, feature engineering, preference modeling, fashion recommendation, large-scale user test

9:43

04/07/2020

Detecting Perceived Emotions in Hurricane Disasters

Shrey Desai, Cornelia Caragea, Junyi Jessy Li

Keywords: Hurricane Disasters, Natural disasters, classification tasks, HurricaneEmo

11:16

23/08/2020

SimClusters: Community-based representations for heterogeneous recommendations at twitter

Venu Satuluri, Yao Wu, Xun Zheng, Yilei Qian, Brian Wichers, Qieyun Dai, Gui Ming Tang, Jerry Jiang, Jimmy Lin

Keywords: community detection, personalization, recommender systems

4:55

07/06/2021

Exercise? I thought you said 'Extra Fries': Leveraging Sentence Demarcations and Multi-hop Attention for Meme Affect Analysis

Shraman Pramanick, Md Shad Akhtar, Tanmoy Chakraborty

Keywords: Subjectivity in textual data, sentiment analysis, polarity/opinion identification and extraction, linguistic analyses of social media behavior

7:57

08/12/2020

Medical Knowledge-enriched Textual Entailment Framework

Shweta Yadav, Vishal Pallagani, Amit Sheth

8:23

06/12/2020

Guiding Deep Molecular Optimization with Genetic Exploration

Sungsoo Ahn, Junsu Kim, Hankook Lee, Jinwoo Shin

Keywords: Algorithms -> Active Learning, Algorithms -> Bandit Algorithms

3:27