Uncensored self-hosted LLM | PowerEdge R630 with Nvidia Tesla P4

Published 2024-07-15
Ollama:
ollama.com/
Ollama UI:
github.com/open-webui/open-webui
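Quick sanity check once the server is up (a minimal sketch using the ollama Python client, pip install ollama; assumes Ollama on the default localhost:11434 and a pulled model, e.g. ollama pull llama3):

  import ollama  # pip install ollama

  # Send one chat turn to the local Ollama server (default: localhost:11434).
  response = ollama.chat(
      model="llama3",  # example name; use any model you've pulled
      messages=[{"role": "user", "content": "Why is the sky blue?"}],
  )
  print(response["message"]["content"])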


Benchmark Program:
github.com/ConnorsApps/ollama-benchmarks
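If you'd rather report tokens/sec than wall-clock duration (as one commenter suggests below), the Ollama HTTP API returns the counters for it in the final response. A minimal sketch, assuming the default port and an example model name:

  import requests

  payload = {
      "model": "llama3",  # example model name
      "prompt": "Explain PCIe passthrough in one paragraph.",
      "stream": False,  # return one final JSON object with timing stats
  }
  r = requests.post("http://localhost:11434/api/generate", json=payload, timeout=600)
  r.raise_for_status()
  stats = r.json()

  # eval_count = generated tokens; eval_duration = generation time in nanoseconds
  tok_s = stats["eval_count"] / stats["eval_duration"] * 1e9
  print(f"{stats['eval_count']} tokens -> {tok_s:.1f} tok/s")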

VM in k8s: github.com/linuxserver/docker-webtop/
The k8s manifest I used: gist.github.com/ConnorsApps/362b54f92392d93dd5ea6c…
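The part of a manifest that actually hands the card to the pod is the GPU resource request. A sketch (not the gist's exact contents, since that link is truncated above; assumes the NVIDIA device plugin is installed on the node):

  apiVersion: v1
  kind: Pod
  metadata:
    name: ollama
  spec:
    containers:
      - name: ollama
        image: ollama/ollama:latest
        ports:
          - containerPort: 11434   # Ollama's default API port
        resources:
          limits:
            nvidia.com/gpu: 1      # exposed by the NVIDIA device plugin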

Featured video: "How to install a Graphics Card in a Rack Server with external power supply?"

What made me pick the Tesla P40: "Use ANY Headless GPU for Gaming in a Virtual Machine!"

"Local Forecast - Elevator" Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 4.0 License
creativecommons.org/licenses/by/4.0/

All Comments (21)
  • @halo64654
    For anyone trying this on old enterprise hardware on top of VMs: tread carefully with HPE Gen 7 through 8. There's a BIOS bug that will not allow you to do PCI passthrough, and you won't be able to do anything PCI related. Also, underrated channel. (An IOMMU sanity check is sketched after the comments.)
  • Interesting! A tour of the homelab maybe? Subscribed!
  • @alivialee
    Love the Emperor's New Groove reference, haha.
  • @cifers8928
    If you can fit the entire model into your GPU, you should use exl2 (the ExLlamaV2 format) for free performance gains with no perplexity loss.
  • @JoeCooperTech
    Brilliant work. Really well done, Connor. New subscriber here.
  • @taktarak3869
    Thank you. I've been thinking of starting my own home lab for a final-year project but wasn't able to find a good place to start :) cheers mate
  • @vulcan4d
    A good test would be to show how many tokens/sec you got instead of duration.
  • @Flight1530
    I just found this channel; I hope you do many more LLM videos with your servers.
  • @DB-dg9lh
    Ya might want to try blurring that recipe again. I can read it pretty easily.
  • @loupitou06fl
    Great video! I got my hands on a couple of Supermicro 1U servers and tried the first part (CPU only) of your video. Is there any other GPU that would fit in that slot?
  • @shreyasbhat
    The title says Tesla P40, but you are using Tesla P4. I'm not sure if the title is wrong or if I got it wrong. Aren't they different GPUs?
  • @AprilMayRain
    Have an R720 with a GTX 750 Ti and need more uses for it! Do you think the 2 GB of VRAM would make any difference for Ollama? (See the multi-GPU/VRAM note after the comments.)
  • @FroggyTWrite
    the r630xd and r730xd have room for a decent sized GPU and PCI-E power connectors you can use with adapters
  • @TheCreaperHead
    This was a well-made video. Is this channel going to be about home lab and server stuff going forward? I'm working on my own home lab running Llama 3 via Ollama with my 3090 FE (I know it's overkill, lol), and I love seeing people make their own stuff. Also, do you know how to make two GPUs work with Ollama? I added a 3060 Ti FE and it isn't being used at all. (See the multi-GPU note after the comments.)
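Re @halo64654's passthrough warning: before committing to a passthrough build, you can at least check whether the host exposes IOMMU groups. A minimal sketch (Linux only; assumes intel_iommu=on or amd_iommu=on is already on the kernel command line):

  #!/usr/bin/env python3
  from pathlib import Path

  # Each symlink under /sys/kernel/iommu_groups/<group>/devices/ is a PCI device.
  devices = sorted(Path("/sys/kernel/iommu_groups").glob("*/devices/*"))
  if not devices:
      print("No IOMMU groups found - PCI passthrough will not work on this host.")
  for dev in devices:
      print(f"group {dev.parts[4]:>3}: {dev.name}")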
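Re @TheCreaperHead's multi-GPU question and @AprilMayRain's VRAM question: as I understand Ollama's scheduler, a model that fits on one GPU stays on one GPU, so a second card can legitimately sit idle; "ollama ps" shows how a loaded model is split between GPU and CPU, which also tells you whether a 2 GB card is contributing at all. To make sure the server sees both cards in the first place, expose them explicitly. A sketch assuming the stock Linux systemd install (GPU indices are examples):

  # /etc/systemd/system/ollama.service.d/override.conf
  [Service]
  Environment="CUDA_VISIBLE_DEVICES=0,1"

  # then: systemctl daemon-reload && systemctl restart ollama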