Eric Lehman
Head of Clinical NLP

Eric Lehman — About Me

My research focuses on the practical deployment of large language models (LLMs) in healthcare settings. I am passionate about understanding how to safely deploy these incredible models to augment clinical decision-making. Specifically, I aim to answer questions such as:

  • How can we effectively build accurate LLM-based systems?
  • How can we ensure that these models are safe and fair?
  • What does it mean for these models to be trustworthy?

To address these questions, I currently serve as the Head of Clinical NLP at OpenEvidence, where I oversee OpenEvidence + ClinicalKey AI, a retrieval-augmented generation (RAG) tool that is already being used in several hospital systems! I graduated from MIT in May 2024, where I was advised by the amazing Peter Szolovits. Prior to that, I worked with the fantastic Byron Wallace at Northeastern University.


  • Ph.D. in Computer Science
    Massachusetts Institute of Technology, 2024
  • M.S. in Computer Science
    Massachusetts Institute of Technology, 2022
  • B.S. in Computer Science
    Northeastern University, 2016


Ph.D. Defense Completed

May 1st, 2024 — Successfully defended my Ph.D. thesis on "Practical Considerations For the Deployment of Clinical NLP Systems" at MIT, advised by the incredible Peter Szolovits!

Best Paper Award at CHIL 2023

June 23rd, 2023 — We won best paper at CHIL 2023.

Joined OpenEvidence as Head of Clinical NLP

July 1, 2022 — Excited to announce that I've joined OpenEvidence to lead their Clinical NLP research team.

Selected Publications

Travis Zack*, Eric Lehman*, Mirac Suzgun, Jorge A Rodriguez, Leo Anthony Celi, Judy Gichoya, Dan Jurafsky, Peter Szolovits, David W Bates, Raja-Elie E Abdulnour, Atul Butte, Emily Alsentzer. Assessing the Potential of GPT-4 To Perpetuate Racial and Gender Biases in Health Care: A Model Evaluation Study, Lancet Digital Health, 2023.

Eric Lehman, Evan Hernandez, Diwakar Mahajan, Jonas Wulff, Micah J Smith, Zachary Ziegler, Daniel Nadler, Peter Szolovits, Alistair Johnson, Emily Alsentzer. Do We Still Need Clinical Language Models?, Conference on Health, Inference, and Learning (CHIL), 2023.

Eric Lehman*, Sarthak Jain*, Karl Pichotta, Yoav Goldberg, Byron C. Wallace. Does BERT Pretrained on Clinical Notes Reveal Sensitive Data?, North American Chapter of the Association for Computational Linguistics (NAACL), 2021.

Eric Lehman, Jay DeYoung, Regina Barzilay, Byron C. Wallace. Inferring Which Medical Treatments Work from Reports of Clinical Trials, North American Chapter of the Association for Computational Linguistics (NAACL), 2019.


Summer Research Mentor

Summer 2023, MIT, Mentor
I worked closely with a summer research intern on exploring and collecting bias datasets for testing LLMs in healthcare settings. This was also an incredibly fun learning experience. Thank you, Weston!

Machine Learning for Healthcare

Spring 2023, MIT, Teaching Assistant
A very fun graduate-level course on machine learning in healthcare. I designed the syllabus, graded homework, planned and held recitations, wrote the final exam, and held weekly office hours. It was an incredible experience to teach this course!

Fundamentals of Computer Science 1

Spring 2017, Northeastern University, Tutor
Northeastern's introductory computer science course. I held weekly office hours, graded homework, and helped students during recitation.