Pushing This AI Beast To Its Limit (2x RTX 6000)🔥

Published 2024-06-20
Learn more about Dell Precision AI-ready workstations - dell.to/3zNGNw7

Thanks to Dell and NVIDIA ❤️

Be sure to check out Pinecone for all your Vector DB needs: www.pinecone.io/

Join My Newsletter for Regular AI Updates 👇🏼
www.matthewberman.com/

Need AI Consulting? 📈
forwardfuture.ai/

My Links 🔗
👉🏻 Subscribe: youtube.com/@matthew_berman
👉🏻 Twitter: twitter.com/matthewberman
👉🏻 Discord: discord.gg/xxysSXBxFW
👉🏻 Patreon: patreon.com/MatthewBerman
👉🏻 Instagram: www.instagram.com/matthewberman_ai
👉🏻 Threads: www.threads.net/@matthewberman_ai
👉🏻 LinkedIn: www.linkedin.com/company/forward-future-ai

Media/Sponsorship Inquiries ✅
bit.ly/44TC45V

Disclosures:
I am an in

All Comments (21)
  • @trader548
    Me watching this in 10 years' time: "Dude, I got more power in my smartwatch."
  • @adamstewarton
    30k to run a 70B model at under 10 t/s? That's a big no from me 😂
  • You should use this monster machine to set up a series of agents interacting together, each with a different specification, running in parallel. Like running a village of NPCs, or doing some really complex tasks that break down into a lot of small tasks (see the first sketch after the comments).
  • @Interloper12
    Months from now, LLM technology improves to the point where we can run 1000 models in parallel. Consumer machines have 100 GPUs packed inside. Inference is lightning fast. Tokens per second is staggering. And just when you think you have all the world's power at your fingertips, each model still thinks that marbles somehow stick to cups when you place them upside down inside microwaves.
  • @hipotures
    My electricity bills have increased by 800% since I started using ML/LLM models locally :(
  • @descmba
    One area where I am trying to find a better solution is having an LLM parse PDF documents, specifically technical documents that have chapters, instructions, and images with captions. It would be interesting to see how the larger models could be used to train smaller ones on this task (a parsing sketch follows the comments).
  • @ashtwenty12
    You'd better be running a monster agent system now with this rig. Please ask it something super complicated, maybe to write a novel.
  • @stonedoubt
    I built my own for a little over $10k: a Threadripper 7960X, 128 GB DDR5, and 3x MSI RTX 4090 Liquid Suprim X in a massive Lian Li V3000 case.
  • @MikeEpler
    How about some multi-model agent workflow testing?
  • @chadnice
    12:34 Sampling steps are set to 30. Go to the settings tab and try the quality option; this should change it to 60 steps. There is also an upscaling option: try it with an image you generated by dragging it to the bottom, then test 2x or whatever is highest. If you really want to push it, keep upscaling the same image (a diffusers sketch of the step bump follows the comments).
  • I'm not SHOCKED. I have been running 5x P40s (120 GB VRAM) for a year now. I spent <1000 euros. No regrets.
  • @TheMattgwapo
    There is a way to load a model into VRAM once and then have multiple requests use it at the same time; your demo is loading the same model into memory over and over (see the batching sketch after the comments).
  • @drewski6843
    Ahhh I want one! lol! Lucky, you're able to test it out. Congrats!
  • @afrikai
    Congratulations! Is this the computer you used for your worm simulation?
  • @360_SA
    Can you use the VRAM as one 96 GB pool, or only as 2 individual 48 GB cards? (Answered in the sharding sketch after the comments.)
  • OMFG, your face is melting with happiness! Good for you, Matt, and thanks for all the great videos! Kudos!
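
Sketches referenced in the comments above (editor's additions):

On the village-of-NPCs idea: a minimal sketch of agents queried in parallel, assuming a local OpenAI-compatible server is already running on localhost:8000 (vLLM, llama.cpp, and Ollama can all expose one). The model name "llama-3-70b" and the personas are invented for illustration.

import asyncio
from openai import AsyncOpenAI

# Assumed local endpoint; any OpenAI-compatible server works here.
client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

PERSONAS = {
    "blacksmith": "You are a gruff village blacksmith.",
    "innkeeper": "You are a chatty village innkeeper.",
    "guard": "You are a suspicious village guard.",
}

async def ask(name: str, persona: str, event: str) -> str:
    # One agent = one system prompt; every agent shares the same loaded model.
    resp = await client.chat.completions.create(
        model="llama-3-70b",  # assumed name registered with the server
        messages=[
            {"role": "system", "content": persona},
            {"role": "user", "content": f"React to this event: {event}"},
        ],
    )
    return f"{name}: {resp.choices[0].message.content}"

async def main():
    event = "A stranger rides into the village at dusk."
    replies = await asyncio.gather(*(ask(n, p, event) for n, p in PERSONAS.items()))
    print("\n".join(replies))

asyncio.run(main())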
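On parsing technical PDFs: a small sketch using PyMuPDF to pull per-page text and count embedded images, producing chunks an LLM could then label. The file name is hypothetical, and real chapter/caption detection would need per-document tuning.

import fitz  # PyMuPDF

doc = fitz.open("manual.pdf")  # hypothetical input file
chunks = []
for page in doc:
    text = page.get_text("text")  # plain text in reading order
    n_images = len(page.get_images(full=True))  # images that may carry captions
    chunks.append(f"[page {page.number + 1}, {n_images} image(s)]\n{text}")

# Each chunk could then be sent to a local LLM with a prompt such as:
# "Extract the chapter title, instructions, and figure captions from this page."
print(chunks[0][:500])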
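On the 30-vs-60 sampling steps tip: the video drives a web UI, but the same knob in diffusers looks like this, assuming SDXL; the model id and prompt are placeholders.

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# 30 steps is a common default; doubling to 60 trades generation time for detail.
image = pipe("a llama assembling a workstation", num_inference_steps=60).images[0]
image.save("llama_60_steps.png")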
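On loading the model once and serving many requests at the same time: this is what vLLM's continuous batching does. A sketch, assuming both GPUs are visible; the 8B model id keeps it runnable in 96 GB at fp16 (a 70B would need a quantized build to fit).

from vllm import LLM, SamplingParams

# Load the weights once; tensor_parallel_size=2 shards that single copy
# across both GPUs instead of reloading the model per request.
llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct",  # assumed model id
          tensor_parallel_size=2)

prompts = [f"Question {i}: explain KV caching in one sentence." for i in range(8)]
outputs = llm.generate(prompts, SamplingParams(max_tokens=64))  # batched together
for out in outputs:
    print(out.outputs[0].text.strip())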
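On using 2x 48 GB as one 96 GB pool: the cards do not merge into a single device, but frameworks can shard one model across both so it behaves that way for inference. A sketch with transformers + accelerate + bitsandbytes; the model id is an assumed example, and 4-bit quantization is what lets a 70B (~140 GB at fp16) fit in 96 GB.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Meta-Llama-3-70B-Instruct"  # assumed example
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # accelerate places layers across both GPUs automatically
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16
    ),
)

inputs = tok("Hello from two GPUs:", return_tensors="pt").to(model.device)
print(tok.decode(model.generate(**inputs, max_new_tokens=20)[0]))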