Python RAG Tutorial (with Local LLMs): AI For Your PDFs

Published 2024-04-17

Learn how to build a RAG (Retrieval Augmented Generation) app in Python that lets you query and chat with your PDFs using generative AI.

This project covers some more advanced topics: how to run RAG apps locally (with Ollama), how to update a vector DB with new items, how to use RAG with PDFs (or any other files), and how to test the quality of AI-generated responses.
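
To illustrate the "update a vector DB with new items" topic, here is a minimal sketch of one common approach: give every chunk a deterministic ID and add only the chunks whose IDs are not already stored. The `Chunk` class, the `source:page:index` ID scheme, and both function names are assumptions made for illustration; they are not taken from this description and may differ from the project's actual code. A real app would hand the selected chunks to its vector store instead of just returning them.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """A piece of text split out of a document (hypothetical structure)."""
    source: str        # path of the file the chunk came from
    page: int          # page number within that file
    text: str          # the chunk's text content
    chunk_id: str = ""  # filled in by assign_chunk_ids


def assign_chunk_ids(chunks):
    """Assign IDs of the form 'source:page:index' (assumed scheme).

    The index counts chunks within the same page and resets to 0
    whenever the source/page pair changes.
    """
    last_page_key = None
    index = 0
    for chunk in chunks:
        page_key = f"{chunk.source}:{chunk.page}"
        index = index + 1 if page_key == last_page_key else 0
        chunk.chunk_id = f"{page_key}:{index}"
        last_page_key = page_key
    return chunks


def select_new_chunks(chunks, existing_ids):
    """Return only the chunks whose IDs are not already in the database."""
    return [chunk for chunk in chunks if chunk.chunk_id not in existing_ids]


# Example: one chunk is already stored, so only the other two are new.
chunks = assign_chunk_ids([
    Chunk("data/rules.pdf", 0, "first chunk of page 0"),
    Chunk("data/rules.pdf", 0, "second chunk of page 0"),
    Chunk("data/rules.pdf", 1, "first chunk of page 1"),
])
existing_ids = {"data/rules.pdf:0:0"}
new_chunks = select_new_chunks(chunks, existing_ids)
print([chunk.chunk_id for chunk in new_chunks])
```

Because the IDs depend only on the file path, page, and position, re-running the load step on an unchanged PDF produces the same IDs and adds nothing. This sketch does not detect edits to a page that keep the same chunk count; that would need something extra, such as a content hash.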

👉 Links
🔗 GitHub: github.com/pixegami/rag-tutorial-v2
🔗 Basic RAG Tutorial: • RAG + Langchain Python Project: Easy ...
🔗 PyTest Video: • How To Write Unit Tests in Python • P...

👉 Resources
🔗 Document loaders: python.langchain.com/docs/modules/data_connection/…
🔗 PDF Loader: python.langchain.com/docs/modules/data_connection/…
🔗 Ollama: ollama.com/

📚 Chapters
00:00 Introduction
01:06 RAG Recap
03:22 Loading PDF Data
05:08 Generate Embeddings
07:16 How To Store and Update Data
10:46 Updating Database
11:45 Running RAG Locally
15:12 Unit Testing AI Output
20:29 Wr

All Comments (21)

@vdabhade 1 month ago

It's hard to find such high-quality videos that get to the point and simplify every aspect. Great work!!!
@tinghaowang-ei7kv 3 months ago

It's hard to find such high quality videos on China's Beep, but you've done it, thank you so much for your selflessness. Great talk, looking forward to the next video. Thanks again, you did a great job!
@musiitwaedmond1426 3 months ago

this is the best RAG tutorial I have come across on youtube, thank you so much man💪
@agustinfilippo5451 21 days ago

I've watched a few of your videos and didn't know which one to comment on first, so I'll just congratulate you here. Great content and even better style.
@davidtindell950 28 days ago

BTW (by the way): I used the OpenAI Embeddings model="text-embedding-3-large" and obtained very similar results to your demo query about Monopoly. I first used Ollama 'llama3', but then retested with Ollama 'mistral:latest'. Surprisingly, the 'mistral' results were better than the 'llama3' ones!?!?! All I can say now is "G'Day Mate" and thank you again!
@frederichominh3152 3 months ago

Best tutorial I've ever seen in a long time, maybe ever. Timing, sequence, content, logic, context... everything is right in your video. Thank YOU and congrats, you are smart as hell.
@denijane89 3 months ago

That was the most useful video I've seen on the topic (and I've watched quite a lot). I didn't realise that the quality of the embedding is so important. I have some working code for a local PDF AI, but I wasn't very impressed by the results. That explains why. Thank you for the great content. I'd love to see other uses of local LLMs.
@Mykyta-Korniienko-CS 1 month ago

Deploying the model on the cloud would definitely be interesting! thank you for the video :D
@fabsync 2 months ago

Oh man.. by far the best tutorial on the subject.. finally someone using pdf and explaining the entire process! You should do a more in-depth series on this...
@NW8187 2 months ago

Simplifying a complex topic for a diverse set of users requires an amazing level of clarity of thought, knowledge and communication skills, which you have demonstrated in this video. Congratulations! Here are some items on my wish list for you when you can get to them. 1. The ability for users to pick from a selected list of open-source LLMs, a list that users can keep updated. 2. A local RAG application for getting insights from personal tabular data that is stored in multiple formats, e.g. Excel/Google Sheets, PDF tables.
@nachoeigu 3 months ago

Your content is amazing! Keep it going. I would like to see a continuation of this video on how to upload and automate the workflow in the cloud (AWS) and how to integrate the chat interface with a Telegram bot.
@trueindian03 14 days ago

This is the best RAG tutorial on youtube, Thanks for the Video, you got a new Subscriber 🎉
@JaqUkto 3 months ago

Thank you very much! I've started my RAG using your vids. Of course, much of your code needed to be updated, but it was simple even given my zero knowledge of Python.
@paulham.2447 3 months ago

Very, very useful and so well explained! Thanks.
@nascentnaga 3 months ago

Suuuuuper helpful. I need to test this for a work idea. thank you!
@joxxen 3 months ago

Very nice, I wish I had this guide a few weeks ago, had to learn it the hard way xD
@muhannadobeidat 3 months ago

Great video and nicely scripted. Thanks for the excellent effort. I find that nomic 1.5 is pretty good for embedding and lightweight as well. I did not do a proper metric-based performance analysis, but in my actual recall and precision testing it is pretty impressive with only 768 dimensions.
@user-xk3tj5cj8p 3 months ago

Recently discovered your channel 🎉 , subscribed 😊 keep up the awesome content
@zhubarb 3 months ago

Crystal clear. Great video.
@ravikiranbasuthkar2818 21 days ago

This is the best practical tutorial I've come across on LLMs, RAG, and LangChain. Could you also make one about agents and their uses?