Scalable Systems for Knowledge Discovery

Benjamin Van Durme, Johns Hopkins University
Host: David Yarowsky

Speaker Biography

Benjamin Van Durme is an Assistant Research Professor in Computer Science and the head of the Natural Language Understanding effort at the Human Language Technology Center of Excellence. His research focuses on methods for extracting implicit and explicit knowledge from human communication. His work ranges from computational natural language semantics to applications of streaming and randomized algorithms for large-scale data mining. He is concerned with building large, composable systems, lately focused on the development and use of Concrete, a Thrift-backed schema for constructing HLT system pipelines, with supporting libraries in various languages (http://hltcoe.github.io). This platform grew out of the JHU HLTCOE SCALE summer workshops (9-10 weeks each), which he co-led in 2012 (25+ participants) and led in 2013 (40+ participants). He is the co-developer of the largest collection of paraphrases in the world (http://paraphrase.org), the largest released resource of text processed for extraction (decades of articles from newspapers such as the NYTimes), and the fastest system in the world for searching large collections of audio for matching keywords. He holds degrees from the University of Rochester (BS, BA ‘01, MS ‘06, PhD ‘09) and Carnegie Mellon University (MS ‘04) in Computer Science, Cognitive Science, Linguistics, and Language Technologies. He has held internships at NIST, BAE Systems, Lockheed Martin’s Advanced Technology Laboratory AI Division (one year full-time), and Google Research (two summers). His research is supported by the NSF, DARPA, Vulcan, and recently by a Google Faculty Award.