Today I’ll talk about boredom. For instance, who reads corporate annual reports without a sense of wasted time? Who reads annual reports with genuine pleasure? Don’t get me wrong, I’m not talking about literary pleasure, but more generally about intellectual pleasure, or a sense of accomplishment. Who on this planet has ever felt a strong sense of usefulness while reading an annual report? Reading annual reports is the corporate equivalent of watching paint dry, if the paint were made of accounting jargon and regulatory compliance. The thing is, they’re of paramount importance for a plethora of actors: they provide insights for multiple, crucial decisions.
AI, especially with the LLM behemoths, seems well suited to end this endless boredom, thanks to a deep understanding of the underlying language. But as we all know by now, GenAI comes with hallucinations. So RAG (retrieval augmented generation) comes to the rescue (an intro here with Amazon Bedrock Knowledge Bases) and grounds LLMs in factuality, acting as a “cheatsheet” that feeds up-to-date data to the LLM, so that the LLM doesn’t state that, for instance, George V is the current king of the United Kingdom.
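Stripped down to its skeleton, a RAG pipeline is just “retrieve, then generate with the retrieved context in the prompt”. Here is a minimal, schematic sketch in Python, where `retriever.search` and `llm.generate` are hypothetical placeholders for whatever vector store and model you actually use:

```python
def answer(question: str, retriever, llm, top_k: int = 5) -> str:
    # "R": fetch the cheatsheet, i.e. the chunks most relevant to the question
    context_chunks = retriever.search(question, top_k=top_k)  # hypothetical API

    # "AG": ground the LLM with that context before generating
    prompt = (
        "Answer the question using only the context below.\n\n"
        "Context:\n" + "\n".join(context_chunks) + "\n\n"
        f"Question: {question}"
    )
    return llm.generate(prompt)  # hypothetical API
```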
RAG has therefore become the go-to pattern of today’s generative AI era. But how can we ensure that the context we retrieve is actually relevant, and that distractors stay out of the cheatsheet?
We are going to focus on the “R” part of RAG, a.k.a. the retriever. Retrievers are in charge of building the cheatsheet. Reducing distractors at the retriever level improves your RAG system twice over: the cheatsheet contains less noise, and, from a cost perspective, a better retriever will likely feed fewer tokens to the LLM.
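To illustrate what the retriever does under the hood, here is a minimal top-k retrieval sketch, assuming chunk embeddings are precomputed NumPy vectors. Anything not kept here never reaches the LLM, which is exactly where the token savings come from:

```python
import numpy as np

def top_k_chunks(query_emb, chunk_embs, chunks, k=3):
    """Keep the k chunks closest to the query; everything else is a potential distractor."""
    # Cosine similarity between the query and every candidate chunk
    q = query_emb / np.linalg.norm(query_emb)
    c = chunk_embs / np.linalg.norm(chunk_embs, axis=1, keepdims=True)
    scores = c @ q
    best = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in best]
```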
This is where embeddings come into play. If you’re not familiar with them, please stop reading this series and go here first. Now, off-the-shelf embeddings, whether open-source or proprietary, have been commoditized. This time, we are going to adjust off-the-shelf embeddings with a technique called “adapters”. Adapters are learned transformations that don’t touch the embedding per se, but rather map an embedding of dimension d into another embedding of dimension d. They can be expressed as matrices: if we have a proprietary embedding X, we just have to learn a matrix A of dimension (d, d) that transforms X into another embedding Y, such that Y = AX.
And yes, indeed, it can be as simple as a single matrix. And yes, indeed, we don’t touch the initial embedding.
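To make this concrete, here is a minimal sketch of a linear adapter, assuming an embedding dimension of 384 and using a bias-free PyTorch linear layer as the matrix A (in practice A is learned, typically from query–document relevance pairs; the off-the-shelf embedding itself stays frozen):

```python
import torch

d = 384  # assumed dimension of the off-the-shelf embedding

# The adapter is just a (d, d) matrix A, wrapped here in a bias-free linear layer
adapter = torch.nn.Linear(d, d, bias=False)

# X: an off-the-shelf embedding that we never modify (random stand-in here)
x = torch.randn(d)

# Y = AX: same dimension, but a geometry adjusted to our domain
y = adapter(x)

print(x.shape, y.shape)  # torch.Size([384]) torch.Size([384])
```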
At the end of the journey, we’ll be able to maximize relevant documents and minimize distractors in a RAG system specialized for 10-K filings! Picture the system we’re going to build as a smart cat handling dozens of documents for us.
Here is how we are going to proceed.
Just one important but slight detour: I’ve also written a companion repository, called rag-adapters, where all the parts are tied together inside a notebook.
Let’s see the menu for today; there is a lot to uncover, between practice and concepts.
We’re going to learn a lot on this journey! Who knew that understanding annual reports could be fun?
Notes: The information provided in this series is for educational purposes only and should not be considered as financial or investment advice. Please read our full Investment Disclaimer for more details.