Pushing This AI Beast To Its Limit (2x RTX 6000)🔥
Published 2024-06-20
Thanks to Dell and NVIDIA ❤️
Be sure to check out Pinecone for all your Vector DB needs: www.pinecone.io/
Join My Newsletter for Regular AI Updates 👇🏼
www.matthewberman.com/
Need AI Consulting? 📈
forwardfuture.ai/
My Links 🔗
👉🏻 Subscribe: youtube.com/@matthew_berman
👉🏻 Twitter: twitter.com/matthewberman
👉🏻 Discord: discord.gg/xxysSXBxFW
👉🏻 Patreon: patreon.com/MatthewBerman
👉🏻 Instagram: www.instagram.com/matthewberman_ai
👉🏻 Threads: www.threads.net/@matthewberman_ai
👉🏻 LinkedIn: www.linkedin.com/company/forward-future-ai
Media/Sponsorship Inquiries ✅
bit.ly/44TC45V
Disclosures:
I am an in
All Comments (21)
-
What should I do next with this machine?
-
Make 100,000 sentences ending with the word Apple
-
Me watching this in 10 years time. Dude I got more power in my smart watch.
-
30k to run a 70b model at under 10 t/s? That's a big no from me 😂
-
You should use this monster to set up a series of agents interacting together, each with a different specification, running in parallel. Like running a village of NPCs, or doing some really complex tasks that break down into a lot of small ones. And so on.
-
Months from now, LLM technology improves to the point where we can run 1000 models in parallel. Consumer machines have 100 GPUs packed inside. Inference is lightning fast. Tokens per second is staggering. And just when you think you have all the world's power at your fingertips, each model still thinks that marbles somehow stick to cups when you place them upside down inside microwaves.
-
My electricity bills have increased by 800% since I started using ML/LLM models locally :(
-
One area where I am trying to find a better solution is having an LLM parse PDF documents, specifically technical documents with chapters, instructions, and images with captions. It would be interesting to see how the larger models could be used to train smaller ones on this task.
-
You had better be running a monster agent system on this rig by now. Please ask it something super complicated, maybe have it write a novel.
-
I built my own for a little over $10k. Threadripper 7960x, 128gb DDR5 and 3 MSI RTX 4090 Liquid Suprim X in a massive Lian Li V3000 case.
-
How about some multi model agent workflow testing
-
12:34 Sampling steps are set to 30. Go to the settings tab and try the quality option; this should change it to 60 steps. There is also an upscaling option. You can try it with an image you generated by dragging it to the bottom. Test 2x or whatever is highest. If you really want to push it, keep upscaling the same image.
-
I'm not SHOCKED. I have been running 5x P40 (120 GB VRAM) for a year now. I spent <1000 Euro. No regrets.
-
There is a way to load a model into VRAM once and then have multiple requests use it at the same time. Your demo is loading the same model into memory over and over.
-
Ahhh I want one! lol! Lucky, you're able to test it out. Congrats!
-
Congratulations! Is this the computer you used for your worm simulation?
-
Can it use the VRAM as one combined 96 GB pool, or only as two individual 48 GB cards?
-
Subscribe to my newsletter for a chance to win a Dell Monitor: gleam.io/otvyy/dell-nvidia-monitor-1
-
Can it run Crysis?
-
OMFG - Your face is melting with happiness! Good for you Matt, and thanks for all the great videos! Kudos!