A brief summary of language model finetuning
Here's a (brief) summary of language model finetuning: the main approaches that exist, their purposes, and what we know about how they work.
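Before walking through the individual approaches, it helps to see the simplest one in code. Below is a minimal sketch of supervised finetuning (SFT): we take a pretrained causal language model and continue next-token-prediction training on curated prompt/response pairs. The checkpoint name (`gpt2`), the toy dataset, and all hyperparameters are illustrative placeholders, not details from this post.

```python
# Minimal supervised finetuning (SFT) sketch: continue next-token-prediction
# training of a pretrained causal LM on prompt/response pairs.
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Toy instruction data; a real SFT dataset contains thousands of such pairs.
pairs = [
    ("What is finetuning?", "Training a pretrained model further on new data."),
    ("Name one finetuning method.", "Low-rank adaptation (LoRA)."),
]
texts = [f"{prompt}\n{response}{tokenizer.eos_token}" for prompt, response in pairs]

def collate(batch):
    enc = tokenizer(batch, return_tensors="pt", padding=True)
    # For causal LM finetuning, the labels are the input tokens themselves;
    # the model shifts them internally to compute next-token loss.
    labels = enc["input_ids"].clone()
    labels[enc["attention_mask"] == 0] = -100  # exclude padding from the loss
    enc["labels"] = labels
    return enc

loader = DataLoader(texts, batch_size=2, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for epoch in range(3):
    for batch in loader:
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```

In practice, full-parameter SFT like this is often replaced by parameter-efficient variants (e.g., LoRA), which freeze the pretrained weights and train small adapter matrices instead, but the training objective stays the same.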