The utility of context when extracting entities from legal documents

19/10/2020

Jonathan Donnelly, Adam Roegiest

Keywords: legal technology, machine learning, nlp

Abstract: When reviewing documents for legal tasks such as Mergers and Acquisitions, granular information (such as start dates and exit clauses) needs to be identified and extracted. Inspired by previous work in Named Entity Recognition (NER), we investigate how NER techniques can be leveraged to aid lawyers in this review process. Because target information is extremely rare in legal documents, we find that the traditional approach of tagging all sentences in a document is inferior, in both effectiveness and the data required to train and predict, to a two-stage approach: a first-pass layer identifies sentences that are likely to contain the relevant information, and the more traditional sentence-level sequence tagging is then run on those sentences. Moreover, we find that such entity-level models can be improved by training on a balanced sample of relevant and non-relevant sentences. We additionally describe the use of our system in production and how its usage by clients means that deep learning architectures tend to be cost-inefficient, especially with respect to the time needed to train models.
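
The first-pass filter followed by sentence-level tagging described in the abstract can be sketched roughly as below. This is an illustrative sketch only, not the authors' system: the function names (`balanced_sample`, `extract_entities`), the keyword filter, the date regex, and the `START_DATE` label are all hypothetical stand-ins for trained models.

```python
import random
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, label)


def balanced_sample(relevant: List[str], non_relevant: List[str],
                    seed: int = 0) -> Tuple[List[str], List[str]]:
    """Downsample both classes to the size of the smaller one (assumed strategy)."""
    rng = random.Random(seed)
    n = min(len(relevant), len(non_relevant))
    return rng.sample(relevant, n), rng.sample(non_relevant, n)


def extract_entities(sentences: List[str],
                     is_relevant: Callable[[str], bool],
                     tag_sentence: Callable[[str], List[Span]]
                     ) -> List[Tuple[int, List[Span]]]:
    """Run the tagger only on sentences kept by the first-pass filter."""
    results = []
    for idx, sentence in enumerate(sentences):
        if is_relevant(sentence):
            results.append((idx, tag_sentence(sentence)))
    return results


# Toy stand-ins for the two trained models (hypothetical, for illustration).
DATE = re.compile(r"\b\d{1,2} [A-Z][a-z]+ \d{4}\b")


def toy_filter(sentence: str) -> bool:
    # First pass: keep sentences that look like they state a start date.
    return "commence" in sentence.lower()


def toy_tagger(sentence: str) -> List[Span]:
    # Second pass: mark date-like spans inside a kept sentence.
    return [(m.start(), m.end(), "START_DATE") for m in DATE.finditer(sentence)]


doc = [
    "This Agreement is made between the parties.",
    "The term shall commence on 1 January 2021.",
    "Notices dated 3 March 2020 shall be sent by mail.",
]
print(extract_entities(doc, toy_filter, toy_tagger))
```

In this toy run the third sentence also contains a date, but the filter drops it before tagging, which is the kind of context-based pruning the abstract argues for when target information is rare.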

The video of this talk cannot be embedded. You can watch it here:

https://dl.acm.org/doi/10.1145/3340531.3412746#sec-supp

The talk and the corresponding paper were published at the CIKM 2020 virtual conference.

Similar Papers

16/11/2020

De-Biased Court's View Generation with Causality

Yiquan Wu, Kun Kuang, Yating Zhang, Xiaozhong Liu, Changlong Sun, Jun Xiao, Yueting Zhuang, Luo Si, Fei Wu

Keywords: court generation, legal ai, automatic generation, fact description

9:31

06/12/2021

Refining Language Models with Compositional Explanations

Huihan Yao, Ying Chen, Qinyuan Ye, Xisen Jin, Xiang Ren

Keywords: machine learning, fairness, language

13:17

06/12/2020

Intra-Processing Methods for Debiasing Neural Networks

Yash Savani, Colin White, Naveen Sundar Govindarajulu

3:22

02/06/2020

Hybrid Reasoning Over Large Knowledge Bases Using On-The-Fly Knowledge Extraction

Giorgos Stoilos, Damir Juric, Szymon Wartak, Claudia Schulz, Mohammad Khodadadi

28:41

04/07/2020

Distinguish Confusing Law Articles for Legal Judgment Prediction

Nuo Xu, Pinghui Wang, Long Chen, Li Pan, Xiaoyan Wang, Junzhou Zhao

Keywords: Legal Prediction, judicial systems, handy services, LJP

11:59

26/04/2020

Pre-training Tasks for Embedding-based Large-scale Retrieval

Wei-Cheng Chang, Felix X. Yu, Yin-Wen Chang, Yiming Yang, Sanjiv Kumar

Keywords: natural language processing, large-scale retrieval, unsupervised representation learning, paragraph-level pre-training, two-tower Transformer models

4:39

02/02/2021

Judgment Prediction via Injecting Legal Knowledge into Neural Networks

Leilei Gan, Kun Kuang, Yi Yang, Fei Wu

14:39

12/07/2020

DeBayes: a Bayesian method for debiasing network embeddings

Maarten Buyl, Tijl De Bie

Keywords: Fairness, Equity, Justice, and Safety

14:27

06/12/2020

Learning to summarize with human feedback

Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul Christiano

3:17

26/04/2020

Generative Models for Effective ML on Private, Decentralized Datasets

Sean Augenstein, H. Brendan McMahan, Daniel Ramage, Swaroop Ramaswamy, Peter Kairouz, Mingqing Chen, Rajiv Mathews, Blaise Aguera y Arcas

Keywords: generative models, federated learning, decentralized learning, differential privacy, privacy, security, GAN

5:02

08/12/2020

Less is Better: A cognitively inspired unsupervised model for language segmentation

Jinbiao Yang, Stefan L. Frank, Antal van den Bosch

10:27

02/02/2021

Do Response Selection Models Really Know What’s Next? Utterance Manipulation Strategies for Multi-turn Response Selection

Taesun Whang, Dongyub Lee, Dongsuk Oh, Chanhee Lee, Kijong Han, Dong-hun Lee, Saebyeok Lee

17:37

19/10/2020

Fairness-aware learning with prejudice free representations

Ramanujam Madhavan, Mohit Wadhwa

Keywords: fairness, prejudice, privacy, interpretability

7:04

16/11/2020

An Imitation Game for Learning Semantic Parsers from User Interaction

Ziyu Yao, Yiqi Tang, Wen-tau Yih, Huan Sun, Yu Su

Keywords: bootstrapping, fine-tuning parsers, theoretical analysis, text-to-sql problem

11:49

16/11/2020

Learning Which Features Matter: RoBERTa Acquires a Preference for Linguistic Generalizations (Eventually)

Alex Warstadt, Yian Zhang, Xiaocheng Li, Haokun Liu, Samuel R. Bowman

Keywords: self-supervised tasks, language understanding, ambiguous tasks, finetuning

12:04

04/07/2020

Benefits of Intermediate Annotations in Reading Comprehension

Dheeru Dua, Sameer Singh, Matt Gardner

Keywords: Reading Comprehension, data collection, data process, Intermediate Annotations

6:01

14/06/2020

Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition

Canjie Luo, Yuanzhi Zhu, Lianwen Jin, Yongpan Wang

Keywords: data augmentation, text recognition, joint training

0:59

16/11/2020

Small but Mighty: New Benchmarks for Split and Rephrase

Li Zhang, Huaiyu Zhu, Siddhartha Brahma, Yunyao Li

Keywords: text task, fine-grained evaluation, automatic process, rule-based model

6:58

16/11/2020

Unsupervised Reference-Free Summary Quality Evaluation via Contrastive Learning

Hanlu Wu, Tengfei Ma, Lingfei Wu, Tariro Manyumwa, Shouling Ji

Keywords: summarization task, document system, rouge, unsupervised learning

11:16

05/01/2021

EvidentialMix: Learning With Combined Open-Set and Closed-Set Noisy Labels

Ragav Sachdeva, Filipe R. Cordeiro, Vasileios Belagiannis, Ian Reid, Gustavo Carneiro

4:58

16/11/2020

A Diagnostic Study of Explainability Techniques for Text Classification

Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein

Keywords: downstream tasks, machine learning, explainability techniques, diverse techniques

11:24

04/07/2020

From Zero to Hero: Human-In-The-Loop Entity Linking in Low Resource Domains

Jan-Christoph Klie, Richard Eckart de Castilho, Iryna Gurevych

Keywords: Human-In-The-Loop Linking, Entity linking, disambiguating mentions, annotation process

12:26

04/07/2020

Low-Resource Generation of Multi-hop Reasoning Questions

Jianxing Yu, Wei Liu, Shuang Qiu, Qinliang Su, Kai Wang, Xiaojun Quan, Jian Yin

Keywords: Low-Resource Questions, generating questions, machine comprehension, multi-hop model

11:54

19/10/2020

Learning to profile: User meta-profile network for few-shot learning

Hao Gong, Qifang Zhao, Tianyu Li, Derek Cho, DuyKhuong Nguyen

Keywords: multi-task learning, multi-modal model, representation learning, meta-learning

12:10

02/02/2021

Memory-Augmented Image Captioning

Zhengcong Fei

16:31

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

11:27

16/11/2020

Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

Trapit Bansal, Rishikesh Jha, Tsendsuren Munkhdalai, Andrew McCallum

Keywords: nlp applications, fine-tuning, meta-learning problem, supervised tasks

11:49

06/12/2020

Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data

Michael Cogswell, Jiasen Lu, Rishabh Jain, Stefan Lee, Devi Parikh, Dhruv Batra

3:29

04/07/2020

Unsupervised Opinion Summarization with Noising and Denoising

Reinald Kim Amplayo, Mirella Lapata

Keywords: Unsupervised Summarization, supervised models, abstractive summarization, Noising

12:16

04/07/2020

Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work?

Yada Pruksachatkun, Jason Phang, Haokun Liu, Phu Mon Htut, Xiaoyi Zhang, Richard Yuanzhe Pang, Clara Vania, Katharina Kann, Samuel R. Bowman

Keywords: Intermediate-Task Learning, natural tasks, data-rich task, intermediate-task training

14:47

05/01/2021

Unsupervised Attention Based Instance Discriminative Learning for Person Re-Identification

Kshitij Nikhal, Benjamin S. Riggan

4:23

02/02/2021

Revisiting Mahalanobis Distance for Transformer-Based Out-of-Domain Detection

Alexander Podolskiy, Dmitry Lipin, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya

16:08

05/04/2021

Learning on Distributed Traces for Data Center Storage Systems

Giulio Zhou, Martin Maas

Keywords: Deep Learning, Reinforcement Learning and Planning -> Navigation

21:12

02/02/2021

Improving Commonsense Causal Reasoning by Adversarial Training and Data Augmentation

Ieva Staliūnaitė, Philip John Gorinski, Ignacio Iacobacci

16:40

08/12/2020

Learning as Abduction: Trainable Natural Logic Theorem Prover for Natural Language Inference

Lasha Abzianidze

14:56

16/11/2020

The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection

Zibo Lin, Deng Cai, Yan Wang, Xiaojiang Liu, Haitao Zheng, Shuming Shi

Keywords: response selection, retrieval-based systems, learning-to-rank problem, learning-to-rank

12:03

19/04/2021

Zero-shot neural passage retrieval via domain-targeted synthetic question generation

Ji Ma, Ivan Korotkov, Yinfei Yang, Keith Hall, Ryan McDonald

12:47

23/08/2020

Diverse rule sets

Guangyi Zhang, Aristides Gionis

Keywords: sampling, classifier, pattern mining, rule learning, diversification, rule sets

9:41

07/09/2020

BCaR: Beginner Classifier as Regularization Towards Generalizable Re-ID

Masato Tamura, Tomoaki Yoshinaga

Keywords: person re-identification, generalizable, soft label, knowledge distillation, Re-ID, domain generalization

6:53