How to instruction-tune a base LLM, making pose estimation easy, and how a domain-adapted LLM is being used for chip design.
Plus industry headlines and awesome community events
What’s up, everyone!
Brace yourself for some shameless plugs in this edition. I’ve been up to a lot of cool things that I’m proud to have worked on and excited to share with you.
Thanks for reading The Generative Generation! Subscribe for free to receive new posts and support my work.
Thank you to everyone who joined the LangChain ZoomCamp on Friday.
Don't worry if you missed it; I’ve got the recording for you. As before, I’ll send links directly to the Zoom recordings; here’s the link to this session’s recording, where you can watch or download the videos.
I covered ReAct, Tree of Thought, and Retrieval Augmented Generation in this session.
Next week is the final session of the series.
I’ll cover prompt management and versioning. The series went by fast, and I hope you’ve enjoyed it! So, now that this LangChain series is wrapping up…what’s next?
I’m glad you asked!
My third course with LinkedIn Learning will be Retrieval Augmented Generation with LlamaIndex. I’ll start preparing materials soon and kick off the ZoomCamp in late January.
This time around, I’ll do it differently.
Instead of one session per week, I’ll run a daily series: we’ll meet for an hour a day for an entire week. I’ll also break it into 2 or 3 cohorts and gate enrollment with an application process.
This format will make for a better learning experience for everyone involved.
🗓️ Events from around the community
Nov 13th: Real-Time AI Threat Detection Using Kafka. This event will showcase a demo using a deep learning model and similarity search to classify network intrusion traffic. Using labelled traffic events, you’ll transform them into vector embeddings to measure the similarity between different network events. This will help you identify and classify benign and malicious events.
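To make the similarity-search idea concrete, here’s a minimal, stdlib-only sketch of nearest-neighbor classification over embeddings. The vectors and labels are toy stand-ins, not real network-traffic embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def classify_event(event_vec, labelled_vecs):
    """Label a new event with the label of its most similar known event."""
    best_label, best_score = None, -1.0
    for label, vec in labelled_vecs:
        score = cosine_similarity(event_vec, vec)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy labelled embeddings standing in for transformed traffic events
known = [("benign", [0.9, 0.1, 0.0]), ("malicious", [0.1, 0.8, 0.6])]
print(classify_event([0.85, 0.2, 0.1], known))  # most similar to the benign event
```

In a real pipeline the embeddings would come from the deep learning model and the nearest-neighbor lookup from a vector database, but the classification logic is the same.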
Nov 16th: DeciDiffusion. I’ll talk about latent diffusion models and some of the training tricks we used to build DeciDiffusion!
My friend (and mentor), Chris Alexiuk, has two events he’d like me to invite you to, and I highly recommend checking them out.
This is part of the AI Makerspace community, one of the best learning platforms for LLMs. I was a student in their cohort on LLMOps in September, and it was so good that I signed up for the current cohort on LLM Engineering.
Their sessions are value-packed and always hands-on. Plus, Chris is an absolute legend:
✨ Blog of the Week
I spent the better part of last week writing a blog to teach you how to instruction-tune a base LLM.
In this blog, I focus on Quantized Low-Rank Adaptation (QLoRA), a method that significantly reduces memory usage, making the fine-tuning of LLMs more accessible and efficient. QLoRA combines 4-bit quantization of the frozen base model with Low-Rank Adaptation (LoRA), which trains a compact set of added parameters while leaving the original weights untouched. Weight updates are represented as products of low-rank matrices, which shrinks the number of trainable parameters, and with them the compute and memory required, by orders of magnitude.
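To see why low-rank updates save so much, here’s a back-of-the-envelope sketch. The 4096×4096 layer size is typical of a 7B model’s attention projections, and the rank r=8 is an illustrative choice, not a recommendation:

```python
# Toy illustration of LoRA's low-rank update: instead of learning a full
# d x k weight update, learn B (d x r) and A (r x k) with a small rank r.
d, k, r = 4096, 4096, 8

full_update_params = d * k    # parameters in a dense weight update
lora_params = d * r + r * k   # parameters in the low-rank factors B and A

print(full_update_params)                # 16777216
print(lora_params)                       # 65536
print(full_update_params // lora_params) # 256x fewer trainable parameters
```

Multiply that saving across every adapted layer and it’s clear why LoRA (and QLoRA on top of it) fits fine-tuning onto a single consumer GPU.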
Here’s a breakdown of what you’ll learn about:
Parameter-efficient fine-tuning (PEFT)
How to prepare data for instruction tuning
How to use the Hugging Face SFTTrainer to perform supervised fine-tuning
How to generate text with the fine-tuned model
How to properly time how long a generation takes
🛠️ GitHub Gems
A new state-of-the-art pose estimation model is here: YOLO-NAS-Pose!
What is Pose Estimation?
Pose estimation in computer vision is a technique for detecting human figures and pinpointing their anatomical key points.
It involves identifying specific body parts, such as elbows, knees, and shoulders, and discerning their spatial relationships. This process, known as keypoint detection, requires the model to accurately recognize and localize various body parts in images or videos, often represented as coordinates. The challenge lies in detecting these key points and understanding the body's posture and orientation in three-dimensional space.
This complex task is fundamental in applications like motion capture, athlete performance analysis, and interactive gaming, where understanding human movement and posture is essential.
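As a toy illustration of what keypoint output looks like, here’s a stdlib-only sketch. The joint names and pixel coordinates are made up and do not reflect YOLO-NAS-Pose’s actual output format:

```python
import math

# Hypothetical keypoint output: each joint is (x, y, confidence) in pixels.
pose = {
    "left_shoulder": (120.0, 80.0, 0.97),
    "left_elbow": (110.0, 140.0, 0.94),
    "left_wrist": (105.0, 200.0, 0.90),
}

def limb_length(pose, joint_a, joint_b):
    """Euclidean distance between two detected joints."""
    ax, ay, _ = pose[joint_a]
    bx, by, _ = pose[joint_b]
    return math.hypot(ax - bx, ay - by)

upper_arm = limb_length(pose, "left_shoulder", "left_elbow")
print(round(upper_arm, 1))  # 60.8
```

Downstream applications build on exactly this kind of geometry: joint angles for athlete analysis, limb trajectories for motion capture, and so on.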
Challenges in Training Pose Estimation Models
One of the most daunting aspects of training pose estimation models, particularly in PyTorch, is the complexity of creating dataloaders and training loops.
These foundational steps require significant effort and expertise. Developers often grapple with intricacies like data preprocessing, batch handling, and ensuring the efficient training of the model. Each step is crucial and poses challenges, making the process time-consuming and technically demanding.
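To make that concrete, here’s a stdlib-only sketch of the shuffle-and-batch core of a training dataloader; a real PyTorch DataLoader adds collation, worker processes, pinned memory, and much more:

```python
import random

def batches(dataset, batch_size, shuffle=True, seed=0):
    """Yield fixed-size batches, mimicking the core of a training dataloader."""
    indices = list(range(len(dataset)))
    if shuffle:
        random.Random(seed).shuffle(indices)  # reshuffle order each epoch
    for start in range(0, len(indices), batch_size):
        yield [dataset[i] for i in indices[start:start + batch_size]]

data = list(range(10))
for batch in batches(data, batch_size=4):
    print(batch)  # three batches: 4, 4, and 2 items
```

Even this simplified version hides decisions (seeding, last-batch handling) that trip people up; the real thing layers on preprocessing and keypoint-aware augmentation, which is where the effort goes.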
This complexity underlines the need for innovative solutions to streamline these processes, making model training more accessible and efficient.
Simplifying Training with SuperGradients
SuperGradients addresses the complexities of training pose estimation models in PyTorch by offering a comprehensive and simplified framework. Its combination of built-in models, streamlined lifecycle, unified approach, and ease of integration makes it a valuable tool for developers in the field of computer vision.
Explore with Prepared Notebooks
To see YOLO-NAS-Pose in action, check out some notebooks I prepared.
Over 1300 people have taken the course in the two weeks since its launch, which is insane to me! It’s free to take if you have a LinkedIn premium account or a subscription to LinkedIn Learning.
Or, you can purchase the course outright for $45.
Free Course Alert: How to Think Like a Data Scientist
Free week is on until Nov 20th at 365 DataScience. That means you can take my course, How to Think Like a Data Scientist, for FREE!
What You'll Learn
This course gives invaluable advice to those who are just starting or about to begin studying data science. You’ll gain an understanding of the type of work you’ll do as a data scientist, why it’s an exciting and demanding job, the best way to build a project portfolio, and, most importantly, develop a mindset that helps you be successful on the job.
Why is data science unique?
The different roles in data science and how to choose the best one for you
What a project should do for you
How to adopt the data science mindset
How to use the scientific method in your work
How to frame problems and ask questions for effective analysis
Think of this course as your opportunity to gain experience and advice from someone who has been on the job for several years and wants to spend time with you and share what he has learned as a senior data scientist.
📰 Industry Pulse
Samsung's latest forum suggests we're on the cusp of a major transformation. At the Samsung AI Forum 2023, the spotlight was on generative AI, a rapidly advancing field promising to revolutionize our daily experiences. Samsung Research led the charge, showcasing their new AI model, Samsung Gauss, and hosting discussions with top minds in the industry. The forum was a melting pot of ideas, with presentations from OpenAI researchers, Korean academics, and Samsung's teams, all delving into the potential and progress of AI technologies.
The latest edition of Startups Weekly dives into the heart of AI's impact on our jobs and the startup ecosystem.
Let's peel back the layers of this digital onion. In a candid exploration, the article suggests that while AI may excel at tasks we're not particularly skilled at, it falls short for experts in creative fields. It's a boon for the average worker, but it also raises questions about job security in a rapidly evolving landscape. The article then takes us on a whirlwind tour of the latest AI developments, the volatile world of venture-backed startups, and the dramatic courtroom saga of a crypto entrepreneur.
With AI's tendrils reaching further into our daily lives and the startup scene, one can't help but wonder: Are we prepared for the seismic shifts AI might bring to our professional landscape?
What's your take on the future of work in the age of AI? Share your thoughts in the comments.
OpenAI's latest announcements suggest a shift that could empower even those without coding experience to create their own AI applications.
In a recent roundup of AI news, OpenAI's developer conference has taken center stage with its showcase of new products and a significant pivot in their business model. The company introduced an improved GPT-4, text-to-speech models, and an API for DALL-E 3. However, the standout was the announcement of GPTs, a new way for developers to create and monetize conversational AI systems through the GPT Store. This move could democratize AI app creation and challenge existing business models in the AI space. With AI becoming more accessible and integrated into various industries, how do you envision these advancements impacting your daily life or profession?
Could we all become AI developers in the near future? Share your thoughts in the comments and let's explore the possibilities together.
Superpowered (now Vapi), a Y Combinator-backed startup, might have an answer with its pivot from an AI-powered notetaker to Vapi, an API for creating natural-sounding voice-based AI assistants.
In a recent development, Superpowered has shifted its focus from enhancing personal productivity through calendar management to enabling developers to craft their own AI assistants. Despite the profitability and user base of their original product, the founders, Jordan Dearsley and Nikhil Gupta, have decided to tackle a more challenging problem: creating a platform that feels human and reduces the clunkiness associated with voice assistants. With the launch of Vapi's API for public use, the startup is not only looking to enhance voice communication but also to eventually develop its own models for audio solutions.
Now, considering the advancements in AI and the potential for more human-like interactions, how do you see voice-based assistants shaping the future of our daily communications and tasks? Share your thoughts in the comments below.
Dragonfly AI is a company that's harnessing principles of biological vision to revolutionize the way we predict human attention in the digital world. As AI continues to weave itself into the fabric of our lives, understanding its trajectory and the innovators shaping its future becomes increasingly essential.
In a recent interview, Dragonfly AI shared insights into their biologically driven AI platform, which is designed to predict where human attention will likely fall on visual assets. This technology is not only saving time and resources by replacing traditional pre-flight testing but is also transcending cultural and language barriers, making it a globally applicable tool.
🤔 As we progress further into an AI-driven era, how do you see predictive analytics shaping the creative decisions in your industry? Share your thoughts in the comments below.
Get a shirt, and support the newsletter.
I keep all my content freely available by partnering with brands for sponsorships.
Lately, the pipeline for sponsorships has been a bit dry, so I launched a t-shirt line to gain community support.
You can check out my designs here and explore the excellent product descriptions I generated using GPT-4!
🔍 Research Refined
Custom tokenizers, domain-adaptive pretraining, supervised fine-tuning, and domain-adapted retrieval models improve LLM performance for chip design tasks. Results show significant performance improvements and up to 5x model size reduction.
Before we dig into the research, let’s get some basics out of the way…
What are Domain-Adapted LLMs?
Domain-Adapted LLMs are specialized versions of LLMs tailored to specific sectors like healthcare, finance, and education. These adaptations address the unique language styles, domain knowledge requirements, and contextual nuances of each field, which are challenging for general-purpose LLMs to handle effectively due to their broad training across diverse data sources.
Why Do We Need Domain-Adapted LLMs?
The need for Domain-Adapted LLMs arises from the complexity and specificity of problems in various domains. General-purpose LLMs, while powerful, struggle with the nuances of domain-specific language, the depth of professional knowledge required, and the unique constraints of each field, including ethical and cultural considerations. Customizing LLMs to specific domains ensures they perform effectively and responsibly in these specialized contexts.
Role of Retrieval-Augmented Generation (RAG) in Domain-Adapted LLMs
RAG is an AI framework that improves LLM responses by grounding models in external, current, and reliable facts. This framework is especially crucial in domain-adapted LLMs, as it provides them access to specialized knowledge beyond their training data, enabling them to answer specific questions more accurately. In essence, RAG acts as an open-book resource for LLMs, enhancing their responses with up-to-date and relevant information.
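Here’s a minimal sketch of the RAG flow: retrieve the most relevant passage, then prepend it to the question. The word-overlap scorer and the passages are toy stand-ins for a real embedding-based retriever and knowledge base:

```python
def score(query, passage):
    """Crude relevance score: word overlap between query and passage."""
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p)

def build_rag_prompt(query, knowledge_base, top_k=1):
    """Retrieve the most relevant passage(s) and prepend them to the question."""
    ranked = sorted(knowledge_base, key=lambda p: score(query, p), reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

kb = [
    "The timing report is generated by the EDA tool after synthesis.",
    "Standard cells are characterized at multiple process corners.",
]
print(build_rag_prompt("How is the timing report generated?", kb))
```

Everything after retrieval is just prompt assembly: the LLM answers the question with the retrieved context in front of it, which is what grounds the response.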
Summary of the ChipNeMo Paper
The ChipNeMo paper explores the use of domain-adapted LLMs in industrial chip design.
It utilizes techniques like custom tokenizers, domain-adaptive continued pretraining, supervised fine-tuning, and domain-adapted retrieval models. The paper evaluates these methods in three LLM applications: an engineering assistant chatbot, EDA script generation, and bug summarization and analysis.
The results show significant improvements in LLM performance over general-purpose models, enabling model size reductions while maintaining or enhancing performance.
The ChipNeMo project employs several domain adaptation techniques to tailor LLMs for chip design:
Custom Tokenizers: The process involves creating a tokenizer from scratch with domain-specific data, identifying unique tokens, expanding the general-purpose tokenizer with these new tokens, and initializing their embeddings. This approach improves tokenization efficiency for chip design data while maintaining general dataset efficiency.
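To illustrate why added domain tokens improve efficiency, here’s a toy greedy tokenizer; the vocabularies are made up, and real tokenizers use learned subword algorithms like BPE rather than longest-match lookup:

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization over a known vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        for end in range(len(text), i, -1):
            if text[i:end] in vocab:
                tokens.append(text[i:end])
                i = end
                break
        else:
            tokens.append(text[i])  # fall back to a single character
            i += 1
    return tokens

general_vocab = {"reg", "ister", "_", "file"}
domain_vocab = general_vocab | {"register_file"}  # token mined from chip-design data

print(tokenize("register_file", general_vocab))  # ['reg', 'ister', '_', 'file']
print(tokenize("register_file", domain_vocab))   # ['register_file']
```

Fewer tokens per domain string means shorter sequences, which is exactly the efficiency gain the paper reports for chip-design data.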
Domain-Adaptive Pretraining (DAPT): DAPT is applied to pre-trained foundational models like LLaMA2. It involves further pretraining on domain-specific data using standard autoregressive language modeling objectives, ensuring training efficiency through techniques like tensor parallelism and flash attention.
Supervised Fine-Tuning (SFT): Post-DAPT, models undergo SFT for alignment with domain-specific tasks. This process includes combining a domain SFT dataset with a general chat SFT dataset and employing an autoregressive optimization objective to focus on optimizing answer tokens.
Retrieval-Augmented Generation (RAG): To mitigate the issue of LLMs generating inaccurate text ("hallucination"), RAG is used. It retrieves relevant passages from a database to be included in the prompt along with the question, grounding the LLM for more accurate answers. The domain-adapted language model for RAG is further fine-tuned with domain-specific training data to improve retrieval accuracy.
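The answer-token focus of the SFT objective above can be sketched in a few lines. The per-token probabilities here are made up; a real implementation masks labels inside a cross-entropy over logits, but the idea is the same:

```python
import math

# Per-token probability the model assigned to the correct next token (made up).
# The mask marks which positions belong to the answer (1) vs the prompt (0).
token_probs = [0.20, 0.10, 0.50, 0.90, 0.80]
answer_mask = [0,    0,    1,    1,    1]  # only the last 3 tokens are the answer

def masked_nll(probs, mask):
    """Average negative log-likelihood over answer tokens only."""
    losses = [-math.log(p) for p, m in zip(probs, mask) if m]
    return sum(losses) / len(losses)

print(round(masked_nll(token_probs, answer_mask), 4))  # 0.3406
```

Because prompt tokens contribute nothing to the loss, the model is optimized purely on producing good answers rather than on reproducing the instructions.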
Here are the key findings:
Enhanced Performance in Specific Applications: Domain-adapted LLMs significantly outperform general-purpose models in chip design tasks like chatbot assistance, script generation, and bug analysis.
Efficiency in Model Size and Cost: Domain adaptation allows for up to a 5x reduction in model size without compromising performance, offering cost and efficiency benefits.
Custom Tokenizers and SFT Impact: Custom tokenizers reduce token counts effectively, and supervised fine-tuning with additional domain-specific instructions markedly improves application proficiency.
Improved Retrieval Accuracy: Fine-tuning the retrieval model with domain-specific data raises the retriever hit rate by 30%, improving the overall quality of RAG responses.
Gap Between Current and Ideal Outcomes: Despite these advances, there remains room for improvement, pointing to further research and development in this area.
That’s it for this one.
See you next week, and if there’s anything you want me to cover or have any feedback, shoot me an email.