No, I don't have anything to say about Sam Altman and OpenAI 🤷🏽‍♂️
Plus Chain-of-Note, awesome events this week, a repo of experiments with GPT-4V, and news headlines
What’s up everyone!
The LangChain Zoomcamp has officially concluded!
Thanks to everyone who showed up for our weekly sessions and everyone who has been watching the recordings. I hope you enjoyed the series and that it helped kickstart your exploration of LangChain. 🦙 I’ll have more details regarding the LlamaIndex Zoomcamp in the new year.
In this final session, I talked about prompt management and quickly showcased how we can use LangSmith to manage prompts.
I’ve been publishing a lot of content on LangChain through Medium. Follow me there to catch up on previous publications and stay updated on new content.
🗓️ Upcoming Community Events
Nov 20 - How to Chat with Image Data Using New GPT-4 Vision API: Discover how GPT-4's Vision API transforms image data analysis with AI. Join a live demo and code-sharing session covering integration, strategies, and best practices.
Here’s an example of the type of demo you’ll learn to build 👇🏽
Nov 20 - How to Fine-tune a Base LLM for Retrieval Augmented Generation (RAG): In this webinar, Deci and Ai Bloks will demonstrate the integration of the DeciLM-6B LLM - fine-tuned for RAG - into a RAG workflow using Ai Bloks' open-source library, llmware. The webinar will focus on use cases in financial services, legal, and compliance. Expect a hands-on experience with code samples that highlight every component of this state-of-the-art open-source RAG system and show how to customize it for your workflows.
🤖 Nov 22 - Agents: LangChain vs. OpenAI Assistants: Learn to develop complex LLM apps at an event hosted by my mentor, Chris Alexiuk! LangChain builds reasoning apps using Chain-of-Thought prompting, and combined with ReAct it enables complex agent-like apps, while OpenAI's Assistants API offers a simpler path to the same goal. Ideal for LLM Ops practitioners and builders who want to develop agent-like systems.
✨ Blog of the Week: Evaluating Contextual Retrieval in Large Language Models
This week’s pick is a blog post titled "Evaluating Contextual Retrieval in Large Language Models" by my friends Juan Olano, Chris Alexiuk, and James Tolton.
The blog post begins by acknowledging a common challenge in large language models (LLMs): LLMs, including extended context versions like GPT-4-Turbo 128K, often struggle when critical information is nestled in the middle of lengthy contexts. They reference a famous paper, “Lost in the Middle: How Language Models Use Long Contexts,” which shows LLMs perform best with information at the start or end of inputs but struggle with details embedded in the middle.
The authors expressed reservations about the existing tests, noting that the Attention mechanism in language models tends to blur or average out the context. They recognized that hiding a small sentence in a large context (ranging from 70K to 120K tokens) might not be the most effective way to test the model's retrieval capabilities.
Thus, they formulated a hypothesis to guide their experiments: Strengthening the signal (i.e., the key information or 'needle') in the context would enable the model to retrieve it more effectively.
The setup included testing GPT-4-Turbo 128K, a variant of the GPT-4 model, in scenarios where it had to retrieve information from large contexts.
The experiments focused on the 'Needle In The Haystack' approach, where a fact was hidden within a document at various positions.
The primary experiment involved testing the model's ability to retrieve a specific fact from different positions within a document, especially in contexts extending over 60K tokens.
The experiment sought to ascertain how the model's performance varied based on the position of the key information within the document.
The experiments revealed a significant decline in the model’s retrieval capabilities in contexts exceeding 60K tokens.
A critical finding was the model's difficulty in locating the 'needle'—the key information—particularly when it was placed between 50% and 70% of the way through the document. This reinforced the challenge language models face in accessing and utilizing information in the middle of lengthy contexts.
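To make the setup concrete, here is a minimal sketch of how a "Needle In The Haystack" context can be constructed. The filler text, needle sentence, and depth fractions are illustrative assumptions, not the authors' exact materials; in the real experiment each context would be sent to GPT-4-Turbo and the answer scored.

```python
# Build contexts with a "needle" fact planted at varying fractional depths.
# In the actual experiment, each context is sent to GPT-4-Turbo 128K and
# the model's answer is scored for whether it recovered the needle.

def build_haystack(filler: str, needle: str, depth: float) -> str:
    """Insert `needle` at a fractional `depth` (0.0 = start, 1.0 = end)
    of the filler text, splitting on sentence boundaries."""
    sentences = filler.split(". ")
    pos = round(depth * len(sentences))
    return ". ".join(sentences[:pos] + [needle] + sentences[pos:])

def found_needle(answer: str, expected: str) -> bool:
    """Naive string-match scoring: did the model's answer contain the fact?"""
    return expected.lower() in answer.lower()

filler = ". ".join(f"Background sentence number {i}" for i in range(100))
needle = "The secret passphrase is 'blue-harvest'"

# Sweep insertion depths; the paper reports degraded retrieval around
# the 50%-70% depth range in 60K+ token contexts.
contexts = {d: build_haystack(filler, needle, d) for d in (0.0, 0.25, 0.5, 0.75, 1.0)}
for depth, ctx in contexts.items():
    print(depth, needle in ctx)
```

Scaling `filler` to tens of thousands of tokens and swapping the string check for a real model call reproduces the shape of the experiment.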
You can reproduce their experiments by following along with this notebook.
🛠️ GitHub Gems
My friends at RoboFlow have been having a lot of fun with the new OpenAI Vision API. They’ve got an awesome repo full of fun experiments and projects.
The code for all these experiments is fully open source. You’ll be surprised by how easy it is to start building with the OpenAI API. I recommend checking out these projects, forking the code, and playing around with it to create your own project.
The best way to learn is to get hands-on!
📰 Industry Pulse
Amazon's latest move in the AI arena might give us some clues.
In a strategic pivot towards artificial intelligence, Amazon is trimming down its workforce in the Alexa unit, redirecting focus and resources to further embrace the burgeoning field of generative AI.
Daniel Rausch, the VP of Alexa and Fire TV, has announced the tough decision to cut several hundred jobs, aiming to realign the company's efforts with customer-centric priorities and the potential of generative AI.
🎧 Ever wondered if you could fine-tune the world around you, choosing exactly what you want to hear?
Researchers at the University of Washington might have just paved the way for such an auditory revolution with their latest AI-powered noise-cancelling technology.
Scientists have developed a prototype of noise-cancelling headphones that goes beyond simply blocking out ambient noise.
By leveraging a deep learning AI algorithm, these headphones can filter out specific sounds in real-time, such as birds chirping or car horns, while allowing other selected sounds to reach the wearer.
This "semantic hearing" system streams sounds to a smartphone app, which processes and cancels out all noise except for the categories the user wants to hear.
Paris is set to become the home of a new AI research lab, Kyutai, aiming to push the boundaries of AGI.
In the heart of Paris, a new chapter in AI research is unfolding with the establishment of Kyutai, a nonprofit AI research lab funded by French billionaire Xavier Niel and other philanthropic contributors.
At the ai-PULSE conference, Niel revealed that the lab's funding has nearly tripled, thanks to generous donations, including a significant sum from Rodolphe Saadé, CEO of CMA CGM.
Kyutai's mission is to work on artificial general intelligence, collaborating with top-tier researchers and leveraging cutting-edge technology, such as Nvidia's H100 GPUs provided at cost by Scaleway.
The lab's open science approach encourages publishing research papers, a practice that is becoming increasingly rare in big tech companies.
🌍 Have you ever wondered how technology can make a difference in the lives of refugees seeking a new home?
The story of GeoMatch, a machine learning tool, might just hold the answer.
In a heartwarming blend of community spirit and cutting-edge technology, an article from Science X delves into the tale of Dominik Rothenhaeusler, a statistics professor at Stanford University, and the innovative refugee placement project he's part of.
GeoMatch is designed to optimize the resettlement process for refugees by matching them with communities where they're most likely to succeed, using a wealth of data and machine learning algorithms.
This tool promises to improve the lives of refugees and ease the administrative burden on host countries.
💡 My Two Cents
I’ve got absolutely nothing to say about the OpenAI and Sam Altman drama. I’m just sitting back with 🍿 and seeing how it all unfolds. Honestly, I hope he doesn’t return to OpenAI.
I want him to start his own company.
Cuz maybe they’ll be hiring for a DevRel, and I can apply for a role there 😆.
Here are my favourite sources covering the unfolding corporate drama:
AI Explained also released a great video covering the drama.
- breaks down what OpenAI has to do to save itself, critically examining the recent leadership transition and suggesting it reflects internal conflicts and strategic missteps that could set back the company's progress and open opportunities for competitors.
- published a great piece covering the challenges and contradictions in the company's approach to AI development, sparking broader discussion about the need for transparency, accountability, and openness in the AI industry.
🔍 Research Refined: Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models (RALM)
While effective in various tasks, traditional LLMs often face limitations in staying current with facts and minimizing factual inaccuracies, known as 'hallucinations'.
RALMs address these limitations by integrating external knowledge sources, reducing factual hallucinations. This integration allows RALMs to access up-to-date information and specific domain knowledge that might not be in their initial training data. The primary advantage of RALMs lies in their ability to enrich their responses with relevant, current information from external sources.
This is particularly useful in scenarios where the LLMs lack direct knowledge of a subject, enabling them to adapt and respond accurately based on newly acquired information.
Despite the advancements, RALMs encounter significant challenges.
One of the primary issues is the reliability of the retrieved information. Retrieving irrelevant or noisy data can lead to misguided responses, causing the RALM to either provide incorrect answers or overlook its inherent knowledge.
Another critical challenge is the RALM's struggle to assess whether it possesses adequate intrinsic and retrieved knowledge to provide an accurate answer.
This becomes evident in scenarios where the model should ideally respond with "unknown" but fails to do so due to its limitations in identifying the unavailability of pertinent information.
In a typical RALM setup, when a query is received, it is first processed by a component known as the retriever. This retriever searches through a vast evidence corpus to find pertinent documents that might contain information relevant to the query.
Once relevant documents are identified, a reader module examines these documents, extracting useful information. The final response is then formulated based on this extracted information.
This process allows the RALM to integrate relevant external knowledge, enriching its understanding of the input text and generating more accurate and contextually relevant answers.
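The retriever → reader flow above can be sketched in a few lines. This is a toy illustration under stated assumptions: the tiny corpus, the word-overlap scoring, and the `read` function are stand-ins for a real dense retriever and an LLM reader, not any particular library's API.

```python
# Toy sketch of the standard RALM pipeline: a retriever ranks documents
# from an evidence corpus, then a reader formulates a response from them.

corpus = [
    "Paris is the capital of France.",
    "The Eiffel Tower was completed in 1889.",
    "Python is a popular programming language.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """The 'retriever': rank documents by naive word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def read(query: str, evidence: list[str]) -> str:
    """Stand-in 'reader': a real RALM would prompt an LLM with the evidence.
    Here we simply surface the top-ranked document."""
    return evidence[0] if evidence else "unknown"

docs = retrieve("What is the capital of France?", corpus)
answer = read("What is the capital of France?", docs)
print(answer)
```

The point of the sketch is the two-stage structure: everything the reader sees is mediated by what the retriever surfaces, which is exactly why noisy retrieval leads to misguided responses.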
⛓️📝 Introduction to the Chain of Note (CON) Framework
The CON framework aims to improve RALMs' robustness by generating sequential reading notes for retrieved documents.
This method allows for a comprehensive assessment of each document's relevance and accuracy, filtering out less credible content and enhancing the precision of responses.
The core innovation of the CON framework lies in its generation of sequential reading notes for each retrieved document. These notes systematically evaluate the relevance and accuracy of the information retrieved from external sources.
By creating these reading notes, the model assesses the pertinence of each document to the query and identifies the most critical and reliable pieces of information within these documents. This process effectively filters out irrelevant or less trustworthy content, leading to more accurate and contextually relevant responses.
Types of Reading Notes and Their Functionality
Direct Answer Notes: When a document directly answers the query, the model formulates the final response based on this information. This type of note is used when the retrieved document provides a straightforward answer to the query.
Contextual Information Notes: If the retrieved document does not directly answer the query but provides useful context, the model leverages this information and its inherent knowledge to deduce an answer. This type of note aids in inferential reasoning, where the model combines external data with what it already knows.
Unknown Response Notes: In cases where the retrieved documents are irrelevant and the model lacks sufficient knowledge to answer, it defaults to responding with "unknown." This type of note is crucial for acknowledging the model's limitations and avoiding the propagation of incorrect information.
How these reading notes are generated
Retrieval of Documents: Initially, when a query is inputted into the RALM, the system retrieves potentially relevant documents from a vast knowledge corpus. This retrieval is based on the likelihood that these documents contain information pertinent to the query.
Sequential Reading Notes Generation: Once these documents are retrieved, the CON framework generates sequential reading notes. The language model analyzes each retrieved document in the context of the query, essentially 'reading' through its content and then producing notes that reflect its understanding and evaluation of the information relative to the query.
Assessment of Relevance and Accuracy: The model assesses each document's relevance to the input query during the reading and note-taking process. It determines whether the information directly answers the query, provides useful context, or is irrelevant. The model also evaluates the accuracy and reliability of the information within these documents. It identifies the most critical and reliable information for formulating an accurate response.
Categorization of Notes: Based on this assessment, the reading notes are categorized into different types. This could include notes indicating a direct answer, notes highlighting useful contextual information, or notes suggesting that the response should be "unknown" due to the lack of relevant or sufficient information.
Integration with Intrinsic Knowledge: Besides generating notes from external sources, the model also integrates its intrinsic knowledge (information it has been trained on) in the note-making process. This helps when external information provides context or partial answers, allowing the model to deduce a more comprehensive response.
Formulating the Final Response: Finally, these reading notes guide the formulation of the final response. The model synthesizes the information from these notes and its inherent knowledge to generate an answer that is precise, contextually relevant, and robust against inaccuracies.
Improvements Brought by Reading Notes
Reducing Misguided Responses: By employing reading notes, the CON framework significantly reduces the chances of the model being misled by irrelevant or noisy data. This is especially crucial when standard RALMs might have provided incorrect answers.
Balancing Knowledge and Acknowledgment of Limitations: The nuanced approach of using different types of reading notes mirrors human information processing. It strikes a balance between direct retrieval, inferential reasoning, and the acknowledgment of knowledge gaps. This leads to a more intelligent and context-aware response system, capable of providing accurate answers and knowing when to admit the lack of information.
That’s it for this one.
See you next week, and if there’s anything you want me to cover or if you have any feedback, shoot me an email.
Thanks for reading The Generative Generation! Subscribe for free to receive new posts and support my work.