My Work
My work currently focuses on the collection and analysis of large healthcare databases, including clinical, genomic, and operational data. I especially enjoy developing software and designing systems to accelerate this work.
I also specialize in visualizing and communicating insights from complex data to both interdisciplinary groups of stakeholders and non-experts.
I am a proponent of open access and reproducibility in research.
My Background
I’m a data scientist, epidemiologist, and software engineer. I have more than a decade of experience building software, designing and implementing studies, and analyzing data for healthcare-related research.
I’m currently an Assistant Professor of Precision Health at Geisinger Research, where I focus on risk prediction, communicating and visualizing complex information, and applications of clinical informatics and bioinformatics. I previously led the data science team at a data-driven healthcare software startup.
My PhD is from the epidemiology department at the University of Maryland, Baltimore. (Epidemiology is a generalist field related to public health research; my training included statistics, study design, survey methods, and causal inference.) My dissertation involved UX research and data presentation optimization for public health information.
Please see my CV for more on my background and experience.
Current Research
My current research includes:
- Predicting germline variant pathogenicity using large genomic and clinical data sources.
- Automated methods for abstracting clinical data from electronic medical records.
Past Research
Some of my past work includes:
The comet software suite for collecting discrete choice experiment data in low-connectivity settings.
I was the primary architect and author of an iOS application, server software, and a JavaScript framework used for distributing survey instruments, collecting data in the field, and securely transmitting it back to a central server. This suite of software has been used successfully in multiple R01 studies in multiple low- and middle-income countries.
This work was done in collaboration with Jan Ostermann, PhD, at the University of South Carolina. It is not publicly accessible, but if you would like more information, please contact me.
My dissertation, titled Assessing and Improving Patient Understanding of Publicly Reported Healthcare-Associated Infection-Related Hospital Quality Measures.
Open Source Software
sinatra-contact-form
A quick, open-source replacement for a service like FormKeep, which can run for free using Heroku and Postmark.
sasfix
A tool for fixing the formatting of SAS output so it can be used in presentations, emailed, etc. without funny characters or unnecessary white space.
Documentation & Teaching
Personal Knowledge Base
This is my “outboard brain” for structured notes on both technical and more mundane topics.
Some of my favorite pages include my R and SciPy reference notes, my exhaustive analysis of Mac and iOS note-taking apps, and my list of resources for learning git.
Tech Notes Blog
I post one-offs that are super specific or don’t fit cleanly in my knowledge base to my Tech Notes blog.
For example:
Thoughts on Reference Management Software
What reference management software should you use? I wrote the first version in 2015 and have updated it constantly since then.
Organizing Data Analysis Projects
Best practices for organizing data analysis code, data, and other related documents.
Data Presentation Tips
Best practices for presenting data, including examples and links to reference materials.
SQL Joins Explained
A screencast and accompanying written explanation of how the different kinds of SQL joins work.
Tools for Epidemiologists
A curated list of online resources and software for epidemiologists.
Retired Projects
Beautify
A rubygem that makes it easier to output pretty tables with Stata. Useful but not for the faint of heart.
Pitchfork music reviews + Rdio mashup
An easy way to see what new music is available on Rdio and how good it is (according to Pitchfork).
The Survey Software Review
A systematic, independent analysis of online survey software for researchers.
Combine
A quick, open-source app for accepting credit card payments for invoices online. No longer maintained because Harvest now has similar functionality built in.