User embeddings

Built user embeddings from collaborative filtering and user history. Designed a pipeline to address the cold-start problem. Demonstrated the embeddings' predictive value in recommendation models.

2021-2022

User interests

Aggregates content labels to the user level. The project included NSFW filtering, label grouping, and decay. I designed and implemented it. The batch feature was built with an Airflow scheduler calling BigQuery scripts; the streaming feature was built with Flink. I also built a user-to-subreddit mapping using Annoy approximate nearest neighbors.
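
To illustrate the filter, group, and decay steps, here is a minimal sketch in plain Python. It is not the production Airflow/BigQuery or Flink code. The function name, the event format `(label, day)`, the exponential half-life decay, and the default of 30 days are all assumptions made for illustration.

```python
from collections import defaultdict


def user_interest_scores(events, now_day, half_life_days=30.0,
                         blocked_labels=frozenset(), label_groups=None):
    """Aggregate one user's content-label events into decayed interest scores.

    events: iterable of (label, day) pairs, where day is a number (hypothetical format).
    blocked_labels: labels to drop entirely (e.g. NSFW labels).
    label_groups: optional mapping from a raw label to its group name.
    """
    label_groups = label_groups or {}
    scores = defaultdict(float)
    for label, day in events:
        if label in blocked_labels:
            continue  # filtering step
        group = label_groups.get(label, label)  # grouping step
        age_days = now_day - day
        # Decay step (assumed exponential): an event loses half its weight
        # every `half_life_days`.
        scores[group] += 0.5 ** (age_days / half_life_days)
    return dict(scores)


events = [("baking", 100), ("grilling", 70), ("nsfw_tag", 100)]
scores = user_interest_scores(
    events,
    now_day=100,
    blocked_labels={"nsfw_tag"},
    label_groups={"baking": "cooking", "grilling": "cooking"},
)
print(scores)
```

In this example the two cooking-related labels collapse into one group, and the 30-day-old event contributes half the weight of the event from today.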

2021-2022

Subreddit depth

Designed and implemented a bespoke Markov chain approach to compute the average time-to-discover for subreddits. Optimized and parallelized an expensive matrix computation for a >99% speedup.
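
The bespoke method itself is not described here, so the sketch below shows only the textbook building block it presumably relates to: the expected hitting time of a target state in a Markov chain. It assumes a row-stochastic transition matrix `P` over subreddits and a target that is reachable from every state. The function name and the toy matrix are hypothetical.

```python
def expected_hitting_times(P, target):
    """Expected number of steps to first reach `target` from each state.

    Solves (I - Q) h = 1, where Q is P restricted to the non-target states.
    Assumes `target` is reachable from every state, so the system is non-singular.
    """
    n = len(P)
    others = [i for i in range(n) if i != target]
    m = len(others)
    # Augmented matrix [I - Q | 1].
    A = [
        [(1.0 if r == c else 0.0) - P[i][j] for c, j in enumerate(others)] + [1.0]
        for r, i in enumerate(others)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("target is not reachable from every state")
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(m):
            if r != col and A[r][col] != 0.0:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    h = [0.0] * n  # hitting time from the target to itself is 0
    for r, i in enumerate(others):
        h[i] = A[r][m]
    return h


# Toy chain: state 0 -> 1 -> 2, each with a 50% chance of staying put.
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]
h = expected_hitting_times(P, target=2)
start = [0.5, 0.5, 0.0]  # assumed distribution over where users begin
average_time = sum(s * t for s, t in zip(start, h))
print(h, average_time)
```

Averaging the per-state hitting times over a start distribution gives one number per target. A dense solve like this costs cubic time per target, which is the kind of matrix work that benefits from optimization and parallelization at scale.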

2021-2022

Brand safety analysis

Built an analytics dashboard. Helped change the serving pattern to increase ad slots by ~8%.

2021-2022

User covariates

Covariates are user variables that we control for when analyzing the impact of A/B tests. I identified impactful covariates and wrote a script to compute them.
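
As an illustration of what controlling for a covariate can look like, the sketch below applies a single-covariate linear adjustment in the style of CUPED. This is one common technique and is an assumption here, not necessarily the method used in this project. The function name and the toy data are hypothetical.

```python
from statistics import mean


def covariate_adjust(y, x):
    """Remove the part of metric `y` that is linearly explained by covariate `x`.

    y: per-user metric measured during the experiment.
    x: per-user covariate measured before the experiment (e.g. prior activity).
    Returns adjusted metric values with the same mean as `y` and, when x and y
    are correlated, lower variance. Assumes `x` is not constant.
    """
    x_bar, y_bar = mean(x), mean(y)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    theta = cov_xy / var_x  # least-squares slope of y on x
    return [yi - theta * (xi - x_bar) for xi, yi in zip(x, y)]


# Toy data where the metric is fully explained by the covariate (y = 2x + 1).
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
adjusted = covariate_adjust(y, x)
print(adjusted)
```

In practice the slope would be estimated on pooled treatment and control data and the adjusted metric compared between the two groups. A covariate is impactful to the extent that it explains variance in the metric.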

2021-2022