Selected Projects

This page collects work samples from my post-secondary studies and from personal projects I have completed since graduation. These include research papers, coding competition entries, and reusable programming modules.

Overview

For my MDS capstone requirement, I conducted an independent study on polygenic risk scoring (PRS) for autism under the supervision of Dr. Jonathan Terhorst. The study ran from May to August 2025. Although scientists have made great progress in identifying genes that cause autism through the Human Genome Diversity Project, the vast majority of samples in the HGDP are of European ancestry. My project evaluated the portability of the autism PRS across underrepresented populations.

Useful Links

GitHub Proposal Report

Tech Stack

Python, R, Bash, PLINK (1.9), Bioconductor, bcftools, Galaxy server, Linux/UNIX, Great Lakes HPC Cluster

Overview

I developed a direct spatial interpolation method that solves for a point's 25-dimensional coordinates given 26 or more Euclidean distances to known data points. Although the method could be applied to any multi-dimensional data, our application focuses specifically on PCA-derived genetic coordinates. Global25 (G25) is a coordinate system created with smartpca from the EIGENSOFT package and can be found here.

I collaborated with GitHub user AimSmall37, who shares my interest in population genetics and genealogy. He implemented the multilateral interpolation mode and created the HTML for the app. I proposed the direct linear solver method, which he integrated into the web app.
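To illustrate how a direct linear solve can recover coordinates from distances, here is a minimal sketch. It is not the project's actual script: the function name `solve_coordinates` and the arguments `refs` and `dists` are hypothetical, and the approach shown is the standard linearization of the distance equations (subtracting one equation from the others to cancel the quadratic term), which may differ from the method used in the app. With 25 dimensions, this needs at least 26 reference points to yield 25 linear equations.

```python
import numpy as np


def solve_coordinates(refs, dists):
    """Estimate an unknown point from its distances to known points.

    refs  : (n, d) array of known coordinates (hypothetical input name)
    dists : (n,) array of Euclidean distances from the unknown point
    Requires n >= d + 1, e.g. 26 or more references for 25 dimensions.
    """
    refs = np.asarray(refs, dtype=float)
    dists = np.asarray(dists, dtype=float)
    n, d = refs.shape
    if n < d + 1:
        raise ValueError(f"need at least {d + 1} reference points, got {n}")

    p0, d0 = refs[0], dists[0]
    # Each distance gives ||x - p_i||^2 = d_i^2. Subtracting the equation
    # for p_0 cancels ||x||^2 and leaves a linear system in x:
    #   2 (p_i - p_0) . x = ||p_i||^2 - ||p_0||^2 - d_i^2 + d_0^2
    A = 2.0 * (refs[1:] - p0)
    b = (np.sum(refs[1:] ** 2, axis=1) - np.sum(p0 ** 2)) - (dists[1:] ** 2 - d0 ** 2)

    # Least squares handles the exact case (n = d + 1) and also the
    # overdetermined case (n > d + 1), where noise is averaged out.
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x


if __name__ == "__main__":
    # Synthetic check with made-up data, not real G25 coordinates.
    rng = np.random.default_rng(0)
    refs = rng.normal(size=(30, 25))
    target = rng.normal(size=25)
    dists = np.linalg.norm(refs - target, axis=1)
    estimate = solve_coordinates(refs, dists)
    print("max abs error:", np.max(np.abs(estimate - target)))
```

Using least squares rather than a plain matrix inverse is a design choice in this sketch: it lets the same code accept more than the minimum number of reference points, which matters when the input distances are rounded or noisy.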

Use Cases for the Coordinate Estimator:

1. Creating synthetic data

2. Estimating Global25 coordinates for a deceased or untested relative

3. Studying population structures and identifying patterns

View Application View Python Script

Tech Stack

Python, NumPy (Linear Algebra), CI/CD, Version Control, Git, HTML, JavaScript

Report

Tech Stack

Python, EconML, R-Learner, Double Machine Learning, Histogram Gradient Boosting (scikit-learn), SHAP, Google Colab

Report

Tech Stack

XGBoost, SVM, Random Forest, Histogram Gradient Boosting, R Markdown

GitHub link
Interactive App

Tech Stack

Ensemble Learning, scikit-learn, SVM, MLOps, Streamlit, Jupyter

GitHub link

Tech Stack

Text-Based Inference, LLMs, QWEN-4B, Google Colab, Unsloth, Hugging Face

GitHub link

Tech Stack

RStudio, Git, CI/CD, Version Control, Software Development

GitHub link

Tech Stack

RStudio, Swirl/Swirlify, CI/CD, Version Control, Data Analysis