To actually answer this, you could look into free APIs of open source models, which have daily limits but are otherwise largely catch-free. You could even mirror endpoints on your VPS if you need to, or host “middleware” like prompt formatters and enhancers.
I say this because, as others said, you cannot actually host AI on a VPS…
Almost all of Qwen 2.5 is Apache 2.0, SOTA for the size, and frankly obsoletes many bigger API models.