Engineered a scalable user profiling pipeline that aggregated content labels to the user level, incorporating NSFW filtering, label grouping, and temporal decay. It was delivered via batch (Airflow/BigQuery) and streaming (Flink) systems, with downstream user-to-subreddit mappings powered by approximate nearest neighbor search.
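To make the aggregation idea concrete, here is a minimal sketch of user-level label aggregation with NSFW filtering, label grouping, and temporal decay. It is an illustration only, not the production pipeline: the exponential half-life form of the decay, the 30-day half-life, the label names, and the `build_user_profiles` function are all assumptions made for this example.

```python
from collections import defaultdict

# All constants below are hypothetical values chosen for illustration.
HALF_LIFE_DAYS = 30.0                              # assumed decay half-life
NSFW_LABELS = {"nsfw"}                             # assumed labels to filter out
LABEL_GROUPS = {"nhl": "sports", "nba": "sports"}  # assumed label -> group map


def build_user_profiles(events, now_day):
    """Aggregate (user_id, label, day) events into decayed per-user scores.

    Each event contributes 0.5 ** (age / half-life), so an event that is one
    half-life old counts half as much as one from today.
    """
    profiles = defaultdict(lambda: defaultdict(float))
    for user_id, label, day in events:
        if label in NSFW_LABELS:
            continue  # NSFW filtering
        group = LABEL_GROUPS.get(label, label)  # label grouping
        age_days = now_day - day
        profiles[user_id][group] += 0.5 ** (age_days / HALF_LIFE_DAYS)
    return {user: dict(scores) for user, scores in profiles.items()}


events = [
    ("u1", "nhl", 100),
    ("u1", "nba", 70),
    ("u1", "nsfw", 100),
    ("u2", "cooking", 100),
]
profiles = build_user_profiles(events, now_day=100)
```

In this toy run, the two sports events for `u1` collapse into one `sports` score, with the 30-day-old event contributing half the weight of the fresh one.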
2021-2022
Designed and implemented a channel recommendation model to identify the most relevant YouTube channels for a brand based on campaign keywords and URL-derived context. The approach leveraged shared embedding spaces and novel clustering techniques to account for multimodal channel content, paired with a two-stage ranking system optimized for real-time querying at scale. This system reduced channel selection time by ~90%, reproduced expert decisions with >99% precision, and was patented.
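A common shape for a two-stage ranking system is cheap candidate retrieval followed by a more careful re-rank of the shortlist. The sketch below shows that generic pattern only. It is an assumption about what "two-stage" could look like, not the patented method: the cosine-similarity retrieval, the `quality` field, and the `two_stage_rank` function are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def two_stage_rank(query_vec, channels, rerank_score, k=2):
    """Stage 1: keep the k channels closest to the query embedding.
    Stage 2: reorder only those k candidates with a costlier scoring function.
    """
    candidates = sorted(
        channels,
        key=lambda c: cosine(query_vec, c["embedding"]),
        reverse=True,
    )[:k]
    return sorted(candidates, key=rerank_score, reverse=True)


# Hypothetical channels; "quality" stands in for any second-stage feature.
channels = [
    {"name": "a", "embedding": [1.0, 0.0], "quality": 0.1},
    {"name": "b", "embedding": [0.9, 0.1], "quality": 0.9},
    {"name": "c", "embedding": [0.0, 1.0], "quality": 1.0},
]
ranked = two_stage_rank([1.0, 0.0], channels, lambda c: c["quality"], k=2)
ranked_names = [c["name"] for c in ranked]
```

The point of the split is that the expensive scorer only ever sees `k` items, which is what keeps query latency bounded as the channel catalog grows.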
2018-2021
Developed a UI-driven pipeline to streamline and automate video review workflows.
2018-2021
I wrote a program that trains a continuous-state Mealy machine and performs inference with Monte Carlo simulations.
2024
Link
Project Titan is a suite of software I wrote to make sports predictions. It is a large project with web scraping, modeling, and architecture components.
2023-2022
Link
This is a now-defunct website I made to track predictions that online experts made for NHL games. The front end is built with Angular and uses a PHP handler to access a MySQL database, which is populated by a library of Python scrapers.
2020
For a project, I wanted to arrange data in a table-like object with one requirement: cells can depend on other cells, with the dependency being any user-provided function. When I update a cell, its children update as well. The data is also saved and loaded dynamically, so that the entire table is never held in memory at once. The link contains a description of the project and the design decisions I made, along with a link to the code on GitHub.
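As a rough illustration of the dependent-cell idea, here is a minimal in-memory sketch. It is not the code from the GitHub project: the `DependentTable` class and its method names are hypothetical, it assumes the dependency graph has no cycles, and it leaves out the on-disk saving and loading entirely.

```python
from collections import defaultdict


class DependentTable:
    """Table-like store where a cell can be computed from other cells."""

    def __init__(self):
        self._values = {}                   # cell key -> current value
        self._formulas = {}                 # cell key -> (func, parent keys)
        self._children = defaultdict(set)   # parent key -> dependent keys

    def set_value(self, key, value):
        """Set a plain cell and refresh every cell that depends on it."""
        self._values[key] = value
        self._propagate(key)

    def set_formula(self, key, func, parents):
        """Declare that `key` equals func(*parent values)."""
        self._formulas[key] = (func, tuple(parents))
        for parent in parents:
            self._children[parent].add(key)
        self._recompute(key)

    def get(self, key):
        return self._values.get(key)

    def _recompute(self, key):
        func, parents = self._formulas[key]
        # Only compute once every parent has a value.
        if all(p in self._values for p in parents):
            self._values[key] = func(*(self._values[p] for p in parents))
            self._propagate(key)

    def _propagate(self, key):
        # Assumes no dependency cycles; a cycle would recurse forever.
        for child in list(self._children[key]):
            self._recompute(child)


table = DependentTable()
table.set_value(("r1", "a"), 2)
table.set_value(("r1", "b"), 3)
table.set_formula(("r1", "sum"), lambda a, b: a + b, [("r1", "a"), ("r1", "b")])
table.set_formula(("r1", "double"), lambda s: 2 * s, [("r1", "sum")])
table.set_value(("r1", "a"), 10)  # updates "sum", then "double"
```

Updating `("r1", "a")` cascades through `sum` into `double` without the caller touching either. A fuller version would recompute in topological order so that a cell with several changed parents is evaluated once per update.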
2019
Link