Uncensored self-hosted LLM | PowerEdge R630 with Nvidia Tesla P4

Published 2024-07-15
Ollama:
ollama.com/
Ollama UI:
github.com/open-webui/open-webui
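Quick sanity check once the server is up (a minimal sketch using the ollama Python client, pip install ollama; assumes Ollama on the default localhost:11434 and a pulled model, e.g. ollama pull llama3):

  import ollama  # pip install ollama

  # Send one chat turn to the local Ollama server (default: localhost:11434).
  response = ollama.chat(
      model="llama3",  # example name; use any model you've pulled
      messages=[{"role": "user", "content": "Why is the sky blue?"}],
  )
  print(response["message"]["content"])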


Benchmark Program:
github.com/ConnorsApps/ollama-benchmarks
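If you'd rather report tokens/sec than wall-clock duration (as one commenter suggests below), the Ollama HTTP API returns the counters for it in the final response. A minimal sketch, assuming the default port and an example model name:

  import requests

  payload = {
      "model": "llama3",  # example model name
      "prompt": "Explain PCIe passthrough in one paragraph.",
      "stream": False,  # return one final JSON object with timing stats
  }
  r = requests.post("http://localhost:11434/api/generate", json=payload, timeout=600)
  r.raise_for_status()
  stats = r.json()

  # eval_count = generated tokens; eval_duration = generation time in nanoseconds
  tok_s = stats["eval_count"] / stats["eval_duration"] * 1e9
  print(f"{stats['eval_count']} tokens -> {tok_s:.1f} tok/s")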

VM in k8s: github.com/linuxserver/docker-webtop/
The k8s manifest I used: gist.github.com/ConnorsApps/362b54f92392d93dd5ea6c…
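The part of a manifest that actually hands the card to the pod is the GPU resource request. A sketch (not the gist's exact contents, since that link is truncated above; assumes the NVIDIA device plugin is installed on the node):

  apiVersion: v1
  kind: Pod
  metadata:
    name: ollama
  spec:
    containers:
      - name: ollama
        image: ollama/ollama:latest
        ports:
          - containerPort: 11434   # Ollama's default API port
        resources:
          limits:
            nvidia.com/gpu: 1      # exposed by the NVIDIA device plugin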

Featured video: "How to install a Graphics Card in a Rack Server with external power supply?"

What made me pick the Tesla P40: "Use ANY Headless GPU for Gaming in a Virtual Machine!"

"Local Forecast - Elevator" Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 4.0 License
creativecommons.org/licenses/by/4.0/

All Comments (21)
  • @halo64654
    For anyone trying this on old enterprise hardware on top of VMs: tread carefully with HPE Gen 7 through 8. There's a BIOS bug that will not allow you to do PCI passthrough, and you won't be able to do anything PCI related. Also, underrated channel. (An IOMMU sanity check is sketched after the comments.)
  • Interesting! A tour of the homelab maybe? Subscribed!
  • @alivialee
    Love the Emperor's New Groove reference, haha.
  • @cifers8928
    If you can fit the entire model into your GPU, you should use exl2 (the ExLlamaV2 format) for free performance gains with no perplexity loss.
  • @JoeCooperTech
    Brilliant work. Really well done, Connor. New subscriber here.
  • @taktarak3869
    Thank you. I've been thinking of starting my own home lab for a final-year project but wasn't able to find a good place to start :) cheers mate
  • @vulcan4d
    A good test would be to show how many tokens/sec you got instead of duration.
  • @Flight1530
    I just found this channel; I hope you do many more LLM videos with your servers.
  • @DB-dg9lh
    Ya might want to try blurring that recipe again. I can read it pretty easily.
  • @loupitou06fl
    Great video! I got my hands on a couple of Supermicro 1U servers and tried the first part (CPU only) of your video. Is there any other GPU that would fit in that slot?
  • @shreyasbhat
    The title says Tesla P40, but you are using Tesla P4. I'm not sure if the title is wrong or if I got it wrong. Aren't they different GPUs?
  • @AprilMayRain
    Have an R720 with a GTX 750 Ti and need more uses for it! Do you think the 2 GB of VRAM would make any difference for Ollama? (See the multi-GPU/VRAM note after the comments.)
  • @FroggyTWrite
    the r630xd and r730xd have room for a decent sized GPU and PCI-E power connectors you can use with adapters
  • @TheCreaperHead
    This was a well-made video. Is this channel going to be about home lab and server stuff going forward? I'm working on my own home lab running Llama 3 via Ollama with my 3090 FE (I know it's overkill, lol), and I love seeing people make their own stuff. Also, do you know how to make two GPUs work with Ollama? I added a 3060 Ti FE and it isn't being used at all. (See the multi-GPU note after the comments.)
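Re @halo64654's passthrough warning: before committing to a passthrough build, you can at least check whether the host exposes IOMMU groups. A minimal sketch (Linux only; assumes intel_iommu=on or amd_iommu=on is already on the kernel command line):

  #!/usr/bin/env python3
  from pathlib import Path

  # Each symlink under /sys/kernel/iommu_groups/<group>/devices/ is a PCI device.
  devices = sorted(Path("/sys/kernel/iommu_groups").glob("*/devices/*"))
  if not devices:
      print("No IOMMU groups found - PCI passthrough will not work on this host.")
  for dev in devices:
      print(f"group {dev.parts[4]:>3}: {dev.name}")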
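Re @TheCreaperHead's multi-GPU question and @AprilMayRain's VRAM question: as I understand Ollama's scheduler, a model that fits on one GPU stays on one GPU, so a second card can legitimately sit idle; "ollama ps" shows how a loaded model is split between GPU and CPU, which also tells you whether a 2 GB card is contributing at all. To make sure the server sees both cards in the first place, expose them explicitly. A sketch assuming the stock Linux systemd install (GPU indices are examples):

  # /etc/systemd/system/ollama.service.d/override.conf
  [Service]
  Environment="CUDA_VISIBLE_DEVICES=0,1"

  # then: systemctl daemon-reload && systemctl restart ollama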