<figcaption class="wp-element-caption"><em>Architecture diagram depicting the three core components of the online course recommendation system: generating embeddings, making recommendations, and serving recommendations.</em></figcaption></figure>
<!-- /wp:image -->

<!-- wp:heading {"level":3} -->
<h3 class="wp-block-heading" id="h-generating-embeddings">Generating embeddings</h3>
<!-- /wp:heading -->

<!-- wp:paragraph -->
<p>As mentioned earlier, the core of a content recommendation system is how similarity is defined. For us, it was computing sentence embeddings and performing a nearest neighbor search. Doing this for 23 million posts and millions of users each run is no small task. Because Databricks is built on PySpark, we naturally leaned heavily into the PySpark ecosystem, which allowed us to rapidly prototype on “small” amounts of data and scale with no code changes.</p>
<!-- /wp:paragraph -->

<!-- wp:paragraph -->
<p>For the one-time job of generating post embeddings, we wrapped the BERT model in a PySpark Pandas UDF that ran on a GPU cluster. Then, on a regular cadence, we computed embeddings for new and edited questions. Each time, the post text, embedding, and other metadata were written to a feature store. We took the same approach to generate the course embeddings.</p>
<!-- /wp:paragraph -->

<!-- wp:paragraph -->
<p>The user embeddings had to be refreshed each time we wanted to make new recommendations, to account for the most recent user activity. We set a minimum number of questions a user had to view within the lookback window; users who met the threshold were eligible to receive a recommendation. For eligible users, we pulled the post embeddings from the post feature store table, pooled them, and wrote them to a separate user feature store table.
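That pooling step can be sketched in plain pandas/NumPy (a minimal illustration with assumed shapes and column handling; the production version lived inside a PySpark Pandas UDF applied per user):

```python
import numpy as np
import pandas as pd

def pool_user_embedding(post_embeddings: pd.Series) -> np.ndarray:
    """Collapse one user's post embeddings into a single user vector.

    `post_embeddings` holds equal-length vectors, one per viewed post.
    Mean pooling is shown here; linear or exponential recency weighting
    would replace the uniform weights below.
    """
    matrix = np.stack(post_embeddings.to_numpy())           # (n_posts, dim)
    weights = np.full(matrix.shape[0], 1.0 / matrix.shape[0])  # uniform
    return weights @ matrix                                  # (dim,)

# Hypothetical user who viewed three posts with 4-dimensional embeddings.
viewed = pd.Series([
    np.array([1.0, 0.0, 0.0, 2.0]),
    np.array([0.0, 1.0, 0.0, 2.0]),
    np.array([0.0, 0.0, 1.0, 2.0]),
])
user_vector = pool_user_embedding(viewed)
```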
For pooling, we also leveraged a Pandas UDF, as there were other pooling methods we wanted to try, like linear or exponential weighting.</p>
<!-- /wp:paragraph -->

<!-- wp:heading {"level":3} -->
<h3 class="wp-block-heading" id="h-making-recommendations">Making recommendations</h3>
<!-- /wp:heading -->

<!-- wp:paragraph -->
<p>With only a couple thousand courses, we loaded all the embeddings into a modified nearest neighbor model, which allowed us to log it directly to the MLflow Model Registry and track lineage to the content feature store. We only had to retrain this model if we added courses to or removed courses from the catalog. Logging the model to the registry also meant there was a clear path to going real-time if we so chose, as we could deploy the model as a serverless API.</p>
<!-- /wp:paragraph -->

<!-- wp:paragraph -->
<p>To make recommendations, the logged model was pulled from the registry and predictions were made directly on a Spark DataFrame. The results were then written to an intermediary Delta table.</p>
<!-- /wp:paragraph -->

<!-- wp:paragraph -->
<p>Those familiar with recommender systems may notice that this is a retrieval-only system. The initial release did not include a ranker model: we first needed to collect labels to train a model that re-ranks results based on predicted clickthrough rate or other business-driven metrics, not just similarity.</p>
<!-- /wp:paragraph -->

<!-- wp:heading {"level":3} -->
<h3 class="wp-block-heading" id="h-serving-recommendations">Serving recommendations</h3>
<!-- /wp:heading -->

<!-- wp:paragraph -->
<p>Once all recommendations were computed and a series of quality checks had passed, the final results were written to an Azure SQL Server database.
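The retrieval step described above, a nearest neighbor search over course embeddings, can be sketched as brute-force cosine similarity (illustrative only; the actual system used a modified nearest neighbor model logged to MLflow):

```python
import numpy as np

def top_k_courses(query: np.ndarray, catalog: np.ndarray, k: int = 2) -> np.ndarray:
    """Indices of the k catalog rows most cosine-similar to `query`.

    Brute-force search is fine at this scale: with only a couple thousand
    courses, the whole catalog fits comfortably in memory.
    """
    q = query / np.linalg.norm(query)
    c = catalog / np.linalg.norm(catalog, axis=1, keepdims=True)
    return np.argsort(-(c @ q))[:k]  # highest similarity first

# Hypothetical catalog of four courses with 4-dimensional embeddings.
catalog = np.eye(4)
user_vector = np.array([0.9, 0.1, 0.0, 0.0])
recommended = top_k_courses(user_vector, catalog)
```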
This database offered low enough latency for our internal ad-serving platform to select from when it needed to serve a course recommendation.</p>
<!-- /wp:paragraph -->

<!-- wp:paragraph -->
<p>When our internal ad server was told to serve an online course ad, it first checked whether the user had opted in to targeted ads. If so, it then checked whether a user-to-content recommendation was available. If not, it used post-to-content; and if there was a failure, or the ad was being served on a non-question page, a top-content recommendation was served. This logic is shown in the flowchart below.</p>
<!-- /wp:paragraph -->

<!-- wp:image {"id":22221,"sizeSlug":"large","linkDestination":"none"} -->
<figure class="wp-block-image size-large"><img src="https://stackoverflow.blog/wp-content/uploads/2023/05/Edtech-Blog-CLC-Flow-1-633x630.png" alt="" class="wp-image-22221" /><figcaption class="wp-element-caption"><em>Flowchart of the recommendation type serving logic.</em></figcaption></figure>
<!-- /wp:image -->

<!-- wp:heading {"level":3} -->
<h3 class="wp-block-heading" id="h-deployment-and-experiment">Deployment and experiment</h3>
<!-- /wp:heading -->

<!-- wp:paragraph -->
<p>Each of the three core components above had various parameters that could be changed and configured, each potentially affecting the final recommendation and the execution time significantly. Which embedding model, lookback window, or pooling method was used? Which cluster configuration, execution schedule, and batch size? To address this, we used Databricks’ CI/CD tool <a href="https://docs.databricks.com/dev-tools/dbx.html">dbx</a> to parametrize the workflows and configurations, then had GitHub Actions execute the builds.
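The kinds of parameters involved can be sketched as a simple, versionable config object (every name and default below is illustrative, not the actual dbx deployment schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RecommenderConfig:
    """Knobs that change the final recommendation or the execution time.

    In the pipeline above, values like these lived in the parametrized,
    version-controlled workflow configuration, so every run recorded
    exactly how a recommendation was computed.
    """
    embedding_model: str = "bert-base-uncased"  # hypothetical model name
    lookback_days: int = 14
    min_questions_viewed: int = 3
    pooling: str = "mean"                       # or "linear", "exponential"
    batch_size: int = 10_000

# Separate environments differ only in configuration, not code.
dev = RecommenderConfig()
prod = RecommenderConfig(lookback_days=30, batch_size=100_000)
```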
This allowed us to easily move from our development workspace to our production workspace and version changes, while always knowing how a recommendation was computed.</p>
<!-- /wp:paragraph -->

<!-- wp:heading -->
<h2 class="wp-block-heading" id="h-looking-forward">Looking forward</h2>
<!-- /wp:heading -->

<!-- wp:paragraph -->
<p>We achieved a lot during the course (no pun intended) of building our online course recommendation system. We went from having no data platform to having a feature-rich machine learning platform that has been handling millions of recommendations every week. It was a learning experience for everyone involved but, most of all, an exciting milestone for Stack Overflow’s AI/ML initiatives. We have already re-purposed some of the work to improve our Related Questions section, which saw a 155% increase in clickthrough rate (look for a future blog post on this). We have many other ideas and experiments in the works for other applications of AI/ML to improve our platforms.</p>
<!-- /wp:paragraph -->