Training is by far the most expensive step, but that doesn’t mean a boatload of customers can’t max out your data center.
I think the future is pretty clearly self-hosted models plus something like OpenRouter: your query is analyzed, and if it requires a frontier model, it is routed to one. Otherwise it goes to Ollama on your local machine.
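To make the routing idea concrete, here is a rough sketch. Everything in it is an assumption for illustration: the keyword list, the word-count cutoff, and the backend labels are made up, and this is not how OpenRouter or Ollama actually decide anything. A real router would probably use a small classifier model instead of keyword matching.

```python
from dataclasses import dataclass

# Hypothetical backend labels; these are not real API identifiers.
LOCAL = "local-ollama"
FRONTIER = "hosted-frontier"

# Assumed signals that a query is "hard"; purely illustrative.
HARD_HINTS = ("prove", "refactor", "multi-step", "legal", "diagnose")


@dataclass
class Route:
    backend: str  # which backend the query should go to
    reason: str   # short explanation of the decision


def route_query(query: str, max_local_words: int = 200) -> Route:
    """Pick a backend for a query using simple, assumed heuristics."""
    text = query.lower()
    # Send the query to the frontier model if it contains a "hard" keyword.
    for hint in HARD_HINTS:
        if hint in text:
            return Route(FRONTIER, f"matched hint '{hint}'")
    # Very long prompts also go to the frontier model.
    if len(text.split()) > max_local_words:
        return Route(FRONTIER, "query too long for local model")
    # Everything else stays on the local machine.
    return Route(LOCAL, "no hard signals found")
```

Calling `route_query("What is the capital of France?")` would pick the local backend, while a query containing "prove" would be sent to the frontier one. The point is only that the expensive model gets used when the cheap one probably won't cut it.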
I've actually been hearing this. I thought the training was the most intensive step, but it looks like usage is more costly than I pictured.
If you get ChatGPT's $200 a month plan, you can cost OpenAI up to $14,000 a month...