Biostatistics Seminar Series

“Automating evidence synthesis via machine learning and natural language processing”

Byron Wallace, PhD

Assistant Professor, Department of Information, Department of Computer Science

University of Texas at Austin

10/19/2015 ~3:30pm
Room 245, 121 South Main Street, Providence
Refreshments beginning at 3:15pm


Evidence-based medicine (EBM) looks to inform patient care with the totality of available relevant evidence. Systematic reviews are the cornerstone of EBM and are critical to modern healthcare, informing everything from national health policy to bedside decision-making. But conducting systematic reviews is extremely laborious (and hence expensive): producing a single review requires thousands of person-hours. Moreover, the exponential expansion of the biomedical literature base has imposed an unprecedented burden on reviewers, thus multiplying costs. Researchers can no longer keep up with the primary literature, and this hinders the practice of evidence-based care.


I will discuss past and recent advances in machine learning and natural language processing methods that look to optimize the practice of EBM and thus mitigate the burden on reviewers. I will also describe emerging work that explores the use of hybrid crowd-sourced/machine learning systems to realize efficiency gains. More specifically, the tasks these methods address include semi-automating evidence identification (i.e., citation screening) and automating the extraction of structured data from full-text published articles describing clinical trials. As I will discuss, these tasks are challenging from a machine learning vantage point, and hence motivate the development of novel approaches. I will present evaluations of these methods in the context of EBM and propose new directions toward automating evidence synthesis.