User embeddings

Designed and implemented a collaborative-filtering-based user embedding framework with streaming updates to mitigate the cold-start problem, enabling effective use in large-scale recommendation and advertising models.

2021-2022

User interests

Engineered a scalable user profiling pipeline that aggregated content labels to the user level, incorporating NSFW filtering, label grouping, and temporal decay. Delivered via batch (Airflow/BigQuery) and streaming (Flink) systems, with downstream user-to-subreddit mappings powered by approximate nearest-neighbor search.

2021-2022
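
Illustrative sketch for the User interests entry: one simple way to aggregate content labels to the user level with filtering and temporal decay. The exponential half-life form, the 30-day default, the blocklist-style filter, and all names here (`decayed_label_scores`, `half_life_days`, `blocked_labels`) are assumptions for illustration, not the production pipeline.

```python
from collections import defaultdict


def decayed_label_scores(events, now, half_life_days=30.0, blocked_labels=frozenset()):
    """Aggregate (user_id, label, timestamp_in_days) events into per-user label scores.

    Each event contributes 0.5 ** (age / half_life_days), so an interaction
    that is one half-life old counts half as much as one that just happened.
    Events whose label is in blocked_labels are skipped (a stand-in for
    NSFW filtering; the real filter is not described in this document).
    """
    scores = defaultdict(float)
    for user_id, label, ts_days in events:
        if label in blocked_labels:
            continue
        age = max(0.0, now - ts_days)  # clamp events stamped in the future
        scores[(user_id, label)] += 0.5 ** (age / half_life_days)
    return dict(scores)


# Hypothetical events: timestamps are days on an arbitrary clock.
example_events = [
    ("u1", "cats", 100.0),
    ("u1", "cats", 70.0),
    ("u1", "nsfw", 100.0),
]
example_scores = decayed_label_scores(
    example_events, now=100.0, half_life_days=30.0, blocked_labels={"nsfw"}
)
```

The same per-event weight can be applied in a batch query or maintained incrementally in a streaming job, since the decayed sum is additive across events.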

Subreddit depth

Built a bespoke Markov chain approach to compute the average time-to-discover for subreddits. Optimized and parallelized the expensive matrix computation for a >99% speedup.

2021-2022
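
Illustrative sketch for the Subreddit depth entry: one standard way to get an "average time to reach a state" from a Markov chain is the expected hitting time, found by solving a linear system. Reading time-to-discover as a hitting time is an assumption, as are the function name `expected_hitting_times` and the toy transition matrix; the bespoke method itself is not described here.

```python
import numpy as np


def expected_hitting_times(P, target):
    """Expected number of steps to first reach `target` from every state.

    P is a row-stochastic transition matrix. With Q defined as P restricted
    to the non-target states, the hitting times h solve (I - Q) h = 1.
    Assumes the target is reachable from every state, so I - Q is invertible.
    """
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    others = [i for i in range(n) if i != target]
    Q = P[np.ix_(others, others)]
    h = np.linalg.solve(np.eye(n - 1) - Q, np.ones(n - 1))
    out = np.zeros(n)  # the hitting time from the target to itself is 0
    out[others] = h
    return out


# Toy 3-state chain (hypothetical): state 2 is the subreddit being "discovered".
toy_P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
toy_times = expected_hitting_times(toy_P, target=2)
```

Each target state needs its own linear solve, and the solves are independent of one another, which is what makes this kind of computation a natural candidate for parallelization.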

Brand safety analysis

Built an analytics dashboard and informed serving pattern changes that increased available ad slots by ~8%.

2021-2022

User covariates

Identified key user-level covariates and built tooling to compute them, improving the rigor and interpretability of A/B test impact analyses.

2021-2022