Friday, June 7, 2019 in *Nicollet D1* | |
09:00–09:15 | Opening Remarks |
09:15–10:30 | Invited Speaker
(slides)
Heng Ji (University of Illinois at Urbana-Champaign) |
10:30–11:20 | Coffee Break |
11:20–12:30 | Oral Session 1 |
11:20–11:40 | Effective Feature Representation for Clinical Text Concept Extraction Yifeng Tao, Bruno Godefroy, Guillaume Genthial and Christopher Potts [pdf] |
11:40–11:55 | An Analysis of Attention over Clinical Notes for Predictive Tasks Sarthak Jain, Ramin Mohammadi and Byron C. Wallace [pdf] |
11:55–12:10 | Extracting Adverse Drug Event Information with Minimal Engineering Timothy Miller, Alon Geva and Dmitriy Dligach [pdf] |
12:10–12:25 | Hierarchical Nested Named Entity Recognition Zita Marinho, Afonso Mendes, Sebastião Miranda and David Nogueira [pdf] |
12:30–14:00 | Lunch |
14:00–15:30 | Oral Session 2 |
14:00–14:20 | Towards Automatic Generation of Shareable Synthetic Clinical Notes Using Neural Language Models Oren Melamud and Chaitanya Shivade [pdf] |
14:20–14:40 | A Novel System for Extractive Clinical Note Summarization using EHR Data Jennifer Liang, Ching-Huei Tsou and Ananya Poddar [pdf] |
14:40–14:55 | Study of lexical aspect in the French medical language. Development of a lexical resource Agathe Pierson and Cédrick Fairon [pdf] |
14:55–15:10 | A BERT-based Universal Model for Both Within- and Cross-sentence Clinical Temporal Relation Extraction Chen Lin, Timothy Miller, Dmitriy Dligach, Steven Bethard and Guergana Savova [pdf] |
15:10–15:25 | Publicly Available Clinical BERT Embeddings Emily Alsentzer, John Murphy, William Boag, Wei-Hung Weng, Di Jindi, Tristan Naumann and Matthew McDermott [pdf] |
15:30–16:00 | Coffee Break |
16:00–16:45 | Poster Session |
A General-Purpose Annotation Model for Knowledge Discovery: Case Study in Spanish Clinical Text Alejandro Piad-Morffis, Yoan Gutiérrez, Suilan Estevez-Velarde and Rafael Muñoz [pdf] | |
Predicting ICU transfers using text messages between nurses and doctors Faiza Khan Khattak, Chloe Pou-Prom, Robert Wu and Frank Rudzicz [pdf] | |
Medical Entity Linking using Triplet Network Ishani Mondal, Sukannya Purkayastha, Sudeshna Sarkar, Pawan Goyal, Jitesh Pillai, Amitava Bhattacharyya and Mahanandeeshwar Gattu [pdf] | |
Annotating and Characterizing Clinical Sentences with Explicit Why-QA Cues Jungwei Fan [pdf] | |
Extracting Factual Min/Max Age Information from Clinical Trial Studies Yufang Hou, Debasis Ganguly, Lea Deleris and Francesca Bonin [pdf] | |
Distinguishing Clinical Sentiment: The Importance of Domain Adaptation in Psychiatric Patient Health Records Eben Holderness, Philip Cawkwell, Kirsten Bolton, James Pustejovsky and Mei-Hua Hall [pdf] | |
Medical Word Embeddings for Spanish: Development and Evaluation Felipe Soares, Marta Villegas, Aitor Gonzalez-Agirre, Martin Krallinger and Jordi Armengol-Estapé [pdf] | |
Attention Neural Model for Temporal Relation Extraction Sijia Liu, Liwei Wang, Vipin Chaudhary and Hongfang Liu [pdf] | |
Automatically Generating Psychiatric Case Notes From Digital Transcripts of Doctor-Patient Conversations Nazmul Kazi and Indika Kahanda [pdf] | |
Clinical Data Classification using Conditional Random Fields and Neural Parsing for Morphologically Rich Languages Razieh Ehsani, Tyko Niemi, Gaurav Khullar and Tiina Leivo [pdf] | |
16:45–17:30 | Panel Discussion
Hongfang Liu (Mayo Clinic) Piet de Groen (University of Minnesota) Elmer Bernstam (University of Texas Health Science Center, Houston) Alistair Johnson (MIT Laboratory for Computational Physiology) |
Heng Ji (University of Illinois at Urbana-Champaign)
Enhancing Quality and Robustness of Biomedical Information Extraction
(slides)
Extracting information from unstructured texts can have a big impact on the biomedical domain, potentially helping to tackle problems ranging from disease diagnosis and drug discovery to precision medicine. In this talk I'll present our recent progress on improving the quality and robustness of biomedical information extraction (IE).
Our first goal is to improve the quality of extraction from a formal genre: scientific literature. IE for the biomedical domain is generally more challenging than IE for the general news domain, since it requires broader acquisition of domain-specific knowledge and deeper understanding of complex contexts. To better encode contextual information and external background knowledge, we propose a novel knowledge base (KB)-driven tree-structured long short-term memory networks (Tree-LSTM) framework, and a graph convolutional networks model, incorporating two new types of features: (1) dependency structures to capture wide contexts; (2) entity properties (types and category descriptions) from external ontologies via entity linking. This framework achieves state-of-the-art performance on Drug-Drug Interaction Relation Extraction and the BioNLP shared task on Event Extraction with the Genia dataset [Li et al., NAACL2019].
However, none of these current supervised deep learning models is robust when moving to a new genre. We observe that the performance of our framework degrades significantly when we move to new informal genres such as clinical notes from the i2b2 task. In fact, we face similar challenges whenever we move an IE system to a new genre, domain, topic, scenario, or language. One major reason lies in the improper way word embeddings are used in the DNN model. The quality of word embeddings is not consistent throughout the vocabulary due to the long-tail distribution of word frequency. Without sufficient contexts, rare word embeddings are usually less reliable than those of common words. This issue is particularly important for clinical notes, which are often full of abbreviations and informal variants of entity mentions. To address this issue, we guide the model to dynamically select and compose features using explicit reliability signals (including word frequency) that inform the model of the quality of each word embedding [Lin et al., ACL2019].
Heng Ji is the Edward P. Hamilton Development Chair Professor at Rensselaer Polytechnic Institute. She will join the Computer Science Department of the University of Illinois at Urbana-Champaign as a tenured full professor in August 2019. She received her B.A. and M.A. in Computational Linguistics from Tsinghua University, and her M.S. and Ph.D. in Computer Science from New York University. Her research interests focus on Natural Language Processing, especially on Information Extraction (IE) and Knowledge Base Population. She was selected as a "Young Scientist" and a member of the Global Future Council on the Future of Computing by the World Economic Forum in 2016 and 2017. She received the "AI's 10 to Watch" Award from IEEE Intelligent Systems in 2013, an NSF CAREER award in 2009, faculty awards from Google, IBM, Bosch and Tencent, PACLIC2012 Best Paper Runner-up, ACL2019 Best Demo Nomination, a "Best of SDM2013" paper, and a "Best of ICDM2013" paper. She has coordinated the NIST TAC Knowledge Base Population task since 2010, and served as the Program Committee Co-Chair of NAACL-HLT2018. She is an associate editor for IEEE/ACM Transactions on Audio, Speech, and Language Processing.
Poster boards are 8ft wide by 4ft high and will be set up in the *Hyatt Exhibit Hall*. Posters should be put up a half hour or so before the scheduled poster session, and cannot be left up all day.
Screens will have an aspect ratio of 16:9.