PROJECTS

I have been learning constantly on the job since finishing my MS in Data Science in 2021.

 
 

Professional Experience

In my time in industry, I’ve worked with numerous stakeholders to define project requirements, optimized SQL queries to save time, designed reports and visualizations in both Oracle Analytics Cloud and Tableau, and analyzed many, many rows of data.

MSDS Capstone: Crohn’s Disease

During the year-long MSDS program, my capstone group worked with the UVA Gut Intelligence Lab to classify MRE images (or image sets) as either exhibiting Crohn’s Disease or not, using a convolutional neural network (CNN) built in PyTorch and trained with transfer learning. Our findings were presented at the 2021 IEEE SIEDS Conference and are available here.

Is Travis McElroy Cheating?

In industry, Bayesian machine learning techniques are used to detect fraud in many different scenarios, such as standardized testing and credit card transactions. Using a toy example (whether or not podcast host Travis McElroy’s dice rolls were truly random), this project explores fraud detection through a variety of approaches. Using Python, we extracted and cleaned his dice rolls from a real-world dataset and compared them to randomly generated unweighted and weighted dice rolls, using data visualization in Seaborn, plus Naive Bayes and logistic regression in PyMC3. More exploration of the topic is warranted, specifically extracting more data from transcripts of episodes in later seasons. Code here
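As a rough illustration of the comparison described above (not the project's actual code), here is a minimal sketch that simulates unweighted and weighted dice and scores each sample against a fair die. It swaps in a plain Pearson chi-square statistic rather than the Naive Bayes and logistic regression models used in the project, and the 20-sided die, the sample size, and the weights are all assumptions made for the example.

```python
import random
from collections import Counter


def chi_square_uniform(rolls, sides=20):
    """Pearson chi-square statistic of `rolls` against a fair die.

    `sides=20` is an assumption for this sketch; larger values mean the
    observed counts sit further from what a fair die would produce.
    """
    counts = Counter(rolls)
    expected = len(rolls) / sides
    return sum(
        (counts.get(face, 0) - expected) ** 2 / expected
        for face in range(1, sides + 1)
    )


rng = random.Random(0)  # fixed seed so the sketch is reproducible
faces = list(range(1, 21))

# Unweighted (fair) rolls: every face equally likely.
fair_rolls = [rng.choice(faces) for _ in range(2000)]

# Weighted rolls: hypothetical weights making faces 16-20 three times as likely.
weights = [1] * 15 + [3] * 5
weighted_rolls = rng.choices(faces, weights=weights, k=2000)

fair_stat = chi_square_uniform(fair_rolls)
weighted_stat = chi_square_uniform(weighted_rolls)
print(f"fair: {fair_stat:.1f}  weighted: {weighted_stat:.1f}")
```

A real set of rolls could be passed to `chi_square_uniform` in the same way and compared against both simulated baselines.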

Outlier Detection

This Medium article, written as a final group project for Data Mining, explores the math and code behind several outlier detection techniques, using the R packages mclust, stats, and tidyverse to import, explore, and analyze the data.

COVID Data Exploration

I used NumPy, Pandas, BeautifulSoup, and Plotly Express to explore the Our World in Data dataset and build interactive visualizations of missingness and other factors, with the goal of finding what influenced the death rate per thousand. Code here
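For a sense of what a missingness check looks like in Pandas, here is a minimal sketch under stated assumptions: the column names and values below are made up for illustration and are not taken from the Our World in Data dataset, and `missing_share` is a hypothetical helper, not a function from the project.

```python
import pandas as pd

# Toy frame with invented values; the real dataset has many more columns.
df = pd.DataFrame(
    {
        "location": ["A", "B", "C", "D"],
        "deaths_per_thousand": [1.2, None, 0.4, 2.1],
        "median_age": [30.0, 41.5, None, None],
    }
)


def missing_share(frame):
    """Fraction of missing values per column, most incomplete first."""
    return frame.isna().mean().sort_values(ascending=False)


share = missing_share(df)
print(share)
```

Columns with a high missing share are the ones to inspect before relating any factor to the death rate.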