Self hosting LLMs on a remote VPS
Hi all, I'd like to hear some suggestions on self hosting LLMs on a remote server and accessing the LLM through a client app or a convenient website. I'm happy to hear about your own setups or about products that left a good impression on you.
I've hosted Ollama before, but I don't think it's intended for remote use. Then again, I'm not really an expert, and maybe there are add-ons or other ways to make it work.
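To make it concrete, this is roughly the kind of "client app" access I'm after. A minimal sketch in Python, assuming Ollama on the server is reachable over HTTP (it serves its API on port 11434 by default) and that the hostname and model name below are just placeholders:

```python
# Minimal sketch of a "client app" talking to a remotely hosted Ollama instance.
# Assumptions: the server exposes Ollama's HTTP API (default port 11434) on a
# reachable address, and a model such as llama3.2 has already been pulled there.
# "llm.example.com" is a placeholder hostname.
import requests

OLLAMA_URL = "http://llm.example.com:11434"

def ask(prompt: str, model: str = "llama3.2") -> str:
    """Send a single prompt to the remote Ollama server and return its reply."""
    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,  # CPU-only boxes can take a while to respond
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(ask("In one sentence, what is a VPS?"))
```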
Thanks in advance!
Do you have lots of money? Cuz that's going to cost lots of money. Just get a cheap GPU and run it locally.
That depends on the use case. An hour of RTX 4090 compute rents for about $0.69, while the card itself costs around $1,600, plus the rest of the computer and the electricity bill. $1,600 / $0.69 is roughly 2,300 hours just to cover the card, so with the rest of the machine and power factored in you'd need something like 4,000+ hours of use to break even. I'm not doing that much gaming and AI stuff, so I'm better off renting a cloud GPU by the hour. You can of course optimize: buy an AMD card, use smaller AI models, pay for less VRAM. But each of those has its own break-even point you need to pass.
Yes, but running an LLM isn't an on-demand workload; it's always on. If you go the GPU route instead of CPU, you're paying for a 24/7 GPU instance.
No, but I have a free instance on Oracle Cloud and that's where I'll run it. If it's too slow or no good I'll stop using it, but there's no harm in trying.
I’d be interested to see how it goes. I’ve deployed Ollama plus Open WebUI on a few hosts, and small models like Llama 3.2 run adequately (at least as fast as I can read) even on an old i5-8500T with no GPU. The Oracle Cloud free tier might work OK.
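If you do try the free tier, one quick way to judge whether it's fast enough is to look at the timing fields Ollama returns with each response. A rough sketch, with the hostname and model as placeholders and assuming I'm remembering the response fields correctly (eval_count is tokens generated, eval_duration is nanoseconds spent generating):

```python
# Rough sketch for gauging a remote Ollama box's speed: the non-streaming
# /api/generate response includes eval_count and eval_duration, which give an
# approximate tokens-per-second figure. "llm.example.com" is a placeholder.
import requests

OLLAMA_URL = "http://llm.example.com:11434"

def tokens_per_second(model: str = "llama3.2") -> float:
    """Run one short generation and compute tokens/s from Ollama's own timings."""
    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": model, "prompt": "Write a haiku about servers.", "stream": False},
        timeout=600,
    )
    resp.raise_for_status()
    data = resp.json()
    return data["eval_count"] / (data["eval_duration"] / 1e9)

if __name__ == "__main__":
    print(f"~{tokens_per_second():.1f} tokens/s")
```

Anything above reading speed (a handful of tokens per second) is usually fine for chat use.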